Historic, archived document 

Do not assume content reflects current 
scientific knowledge, policies, or practices. 









Table of Contents 







WESTAT

An Employee-Owned Research Corporation

1650 Research Blvd. • Rockville, MD 20850-3129 • 301 251-1500



October 20, 1987 



Patrick Brannen 
Office of Family Assistance 
Family Support Administration 
B110 Transpoint Building
2100 Second Street, S.W. 
Washington, D.C. 20201 

Dear Pat: 

This report is submitted in response to contracts 600-84-0262 and HHS-100-87-0009. It 
is a revision of an earlier version that was submitted for your review and comment on
October 2, 1986. However, funds were not available for completing and revising the report 
until several months later. 

This report examines and evaluates the validity of the statistical methods used in
AFDC to estimate overpayment errors and to compute disallowances from those
estimates. Except for one special case, and otherwise only in a very limited way, it does
not consider approximately optimum sample sizes. Nor does it consider possible
procedures for improving the efficiency of sampling through stratification, through
improved allocation of the sample sizes to the state and Federal samples, or through other
means of reducing sampling errors without increasing total costs; these will be the subject
of a later study.

We note that summaries and details of a number of analyses are included in this 
report. For most of these, as well as other analyses that have been completed but that are 
not included in the report, we have diskettes that can be made available for anyone who 
wishes to obtain the information in this form for further analysis or other purposes. 

It has been a real pleasure working with you, and with other OFA staff members in 
this study. We wish to thank especially, in addition to yourself, Sue Osman, Sean Hurley,
John Bowes and Debbie Chassman (who participated before she left OFA), for their many 
hours of insightful discussion of the issues involved, and for reviewing and commenting 
on and guiding us in various phases of the study as it has progressed. We also want to 
express our thanks to Jacqueline Swope Nemes for the outstanding job she did in 
preparation of the manuscript, and in production of the report. 



Sincerely, 

Morris H. Hansen 
Chairman of the Board 



MH/jsn 






A STATISTICAL EVALUATION 
OF AFDC QUALITY CONTROL 



Prepared by: 

Morris H. Hansen 
and Benjamin J. Tepping 



Westat, Inc. 

1650 Research Boulevard 

Rockville, Maryland 20850 



Prepared for: 

Office of Family Assistance 

Family Support Administration 

Washington, D.C. 20201 



October 1987 






A STATISTICAL EVALUATION OF AFDC QUALITY CONTROL 

TABLE OF CONTENTS

Chapter                                                                Page

1      INTRODUCTION AND SUMMARY                                        1-1

       1.1    Introduction                                             1-1
       1.2    Some Summary Results and Conclusions                     1-9

2      STATISTICAL VALIDITY OF AFDC-QC METHODOLOGY                     2-1

       2.1    Test Populations                                         2-1
       2.2    Evaluation of the Regression Estimator                   2-4
       2.3    Evaluation of Computed Confidence Intervals              2-7
       2.4    An Improved Procedure for Computing
              Confidence Bounds                                        2-14
       2.5    Some Further Considerations for Estimating
              Sampling Error                                           2-17
              2.5.1    Pooled Variance Estimates                       2-21
              2.5.2    Implications for the Choice of
                       Variance Estimators                             2-25
              2.5.3    Note on Computation of Characteristics of
                       Confidence Intervals Using the Pooled
                       Variance Estimator                              2-29
       2.6    Conclusions on the Validity of the
              Regression Methodology                                   2-30

3      CONSIDERATIONS IN CHOICE OF LOWER
       CONFIDENCE BOUND VERSUS POINT ESTIMATE IN
       DETERMINING DISALLOWANCES                                       3-1

       3.1    Introduction                                             3-1
       3.2    Use of Point Estimate Versus Lower Confidence
              Bound in Computing Annual Disallowances                  3-3
       3.3    Some Implications and Issues Concerning
              Use of the Lower Confidence Bound                        3-9
              3.3.1    Choice of Nominal Confidence Level              3-9
              3.3.2    Improved Procedures for Computing
                       Confidence Bounds                               3-10
              3.3.3    Comparative Precision of Lower Confidence
                       Bound and Point Estimate                        3-11
              3.3.4    Controlling Impact of Sample Size and of
                       Poor-Quality QC Work on Lower Confidence
                       Bound                                           3-12
       3.4    Some General Considerations on Optimum
              Sample Size                                              3-13
       3.5    Optimum Sample Size for Computing
              Disallowances                                            3-15
       3.6    The Impact in FY 1981 of Three Disallowance
              Rules - Rules A, B, and C                                3-18
       3.7    An Alternative Rule for Computing
              Disallowance - Rule D                                    3-19
       3.8    Summary                                                  3-30



List of Appendices

Appendix                                                               Page

A      Description of the Three Test Populations and
       the Sampling Procedure Used in Simulations                      A-1

B      Evaluation of the Regression and
       Difference Estimators                                           B-1

C      Computation of Confidence Intervals                             C-1

       Technical Note for Appendix C: A Note on
       Confidence Intervals                                            C-19

D      Reliability of Lower Confidence Bounds                          D-1

E      Exploration of Some Alternative Procedures
       for Computing Pooled Variance Estimates                         E-1

       Technical Note for Appendix E                                   E-17

F      Optimum Sample Size for Disallowances
       Based on Point Estimates                                        F-1

       Technical Note for Appendix F                                   F-6

G      Optimum Sample Size for Disallowances
       Based on Lower Confidence Bounds                                G-1

H      Rule D for Computing Disallowances
       Based on Accumulations Across Years                             H-1

I      Effect of Substituting t for T in
       Estimating Overpayment Error Rates                              I-1

List of Tables

Table                                                                  Page

2-1    Some characteristics of the test populations
       and of the full AFDC population (1982)                          2-3

2-2    Sample sizes by state for 12-month period
       ending September 30, 1982                                       2-5

2-3    Evaluation of regression estimator based on
       computations for 1000 independent samples
       drawn from Test Population A                                    2-8

2-4    Proportion of observed samples in which value
       being estimated was above, below, or covered
       by specified nominal confidence bounds,
       for Test Populations A, B, and C                                2-9

2-5    Percentage distribution of overpayment errors
       as determined by the state and Federal
       evaluation for Test Population A                                2-11

2-6    Proportion of samples in which the true error rate is above,
       below or covered by specified nominal confidence intervals,
       based on logarithmic transformation of Jackknife replicate
       estimates, Population A                                         2-17

2-7    Approximate coefficients of variation of sr and s^ from
       1000 samples drawn from Test Populations A, B, and C for
       alternate sample sizes, compared with samples drawn from
       normal distributions                                            2-18

2-8    Properties of alternative procedures for computing
       confidence intervals for R, for Population A                    2-26

3-1    Some illustrative approximate average results over
       repeated samples for annual disallowances computed by
       Rules A, B, and C                                               3-4

3-2    Average annual disallowances, computed for Rule B,
       for specified sample sizes                                      3-8

3-3    Approximate optimum Federal sample sizes (n*) for
       computing annual disallowances based on a lower
       confidence bound (Rule C), for alternate levels of total
       Federal payment, and of excess of overpayment error
       rate over the target rate                                       3-17

3-4    Disallowances based on alternative rules, FY 1981               3-20

3-5    Expected accumulated disallowances compared for
       Rules A, C, and D                                               3-24

3-6    Summary of aggregate disallowances from application
       of Rule D to eligible states, 1981-1984                         3-27

3-7    States reaching full settlement by or before 1984, if Rule D
       were initiated in 1981, and if a 15 percent estimated cv
       were adopted as the criterion                                   3-28

3-8    States reaching full settlement by or before 1984, if Rule D
       were initiated in 1981, and if a 10 percent estimated cv
       were adopted as the criterion                                   3-29

3-9    Application of Rule D to states                                 3-31

A-1    Statistics for Population A                                     A-3

A-2    Statistics for Population B                                     A-4

A-3    Statistics for Population C                                     A-5

A-1A   Cases in Population A that state found ineligible,
       with Federal finding                                            A-6

A-1B   Cases in Population B that state found ineligible,
       with Federal finding                                            A-7

A-1C   Cases in Population C that state found ineligible,
       with Federal finding                                            A-8

A-2A   Cases in Population A for which the state
       found no error or only underpayment                             A-9

A-2B   Cases in Population B for which the state
       found no error or only underpayment                             A-10

A-2C   Cases in Population C for which the state
       found no error or only underpayment                             A-11

A-3A   Cases in Population A for which the state
       found eligible but overpayment                                  A-12

A-3B   Cases in Population B for which the state
       found eligible but overpayment                                  A-13

A-3C   Cases in Population C for which the state
       found eligible but overpayment                                  A-14

B-1    Average values of estimated payment error rate R and its
       estimated standard deviation based on 1000 independent
       samples from Population A, by estimator and sample size         B-10

B-2    Variance of the estimated payment error rate and the
       average of estimates, by estimator and sample size              B-11

B-3    Some summary statistics for 1000 simulations
       for Population A, B, and C                                      B-12

B-4    Summary measures for distribution of residuals, for
       regression of T on y                                            B-13

C-1    Correlation of R and s_R, coefficients of variation of s_R and
       of ρ, estimated from 1000 independent samples of Test
       Populations A, B, and C, for various sample sizes               C-2

C-2A   Population A: Summary statistics                                C-9

C-2B   Population B: Summary statistics                                C-10

C-2C   Population C: Summary statistics                                C-11

C-3    Estimated coverage of 95 percent and 90 percent nominal
       confidence intervals for three test populations, based on
       alternative regression estimators using the estimated ρ and
       a minimum ρ of .8                                               C-12

C-4    Coverage of confidence intervals by logarithmic Jackknife,
       Population A                                                    C-13

C-5    Bias of nominal coverage probabilities, for samples from
       a skewed distribution                                           C-23

C-6    Tail coverages as estimated by simulation and as given
       by the normal model                                             C-26

D-1    Variances and standard errors of 95 percent lower
       confidence bounds and of estimated payment error rates,
       for regression estimator, for three test populations for
       seven illustrative sample sizes                                 D-1

D-2    Distribution of states by the estimated correlation
       between state and Federal findings for
       fiscal years 1981-1984, for 44 states                           D-6

E-1    Estimated unweighted correlations of true unit variance of
       ^ for 1983 with estimated unit variances for 1984               E-11

E-2    Data and results of regression estimates of variance,
       by states                                                       E-32

E-3    Composite estimates of unit variance,
       using zero squared bias, by states                              E-33

E-4    Composite estimate of unit variance,
       using high estimate of average squared bias, by states          E-34

E-5    Pooled unit variance estimates, by states                       E-35

F-1    Slope of the Federal gain function of the state sample size     F-7

G-1    Rough approximation to optimum size of the
       Federal subsample if the only consideration
       were the net return from disallowances                          G-6

G-2    Expected net gain from disallowances, based
       on simulations from Population A                                G-7

H-1 to H-16
       ... to Federal withholding, Rule D, Examples 1 to 16            H-8 to 23

I-1    Comparison of variances of x''/t and x''/T
       for Population A                                                I-3



List of Figures

Figure                                                                 Page

2-1    Mean findings of dollar error per case in 1000 independent
       samples for each of four sample sizes, Population A             2-6

2-2    Distribution of estimated payment error rate                    2-13

B-1    Mean findings of dollar error per case in 1000 independent
       samples for each of four sample sizes, Population B             B-2

B-2    Mean findings of dollar error per case in 1000 independent
       samples for each of four sample sizes, Population C             B-3

C-1    Distribution of estimated standard error                        C-14

C-2A   Scatterplot for Population A - sample size 1                    C-15

C-2B   Scatterplot for Population A - sample size 2                    C-16

C-2C   Scatterplot for Population A - sample size 3                    C-17

C-2D   Scatterplot for Population A - sample size 4                    C-18

D-1    Cumulative distribution of estimated correlation
       for 44 states                                                   D-7

D-2A   Cumulative distribution of the estimated correlation,
       Population A                                                    D-8

D-2B   Cumulative distribution of the estimated correlation,
       Population B                                                    D-9

D-2C   Cumulative distribution of the estimated correlation,
       Population C                                                    D-10

D-3A   Cumulative distribution of the nominal 95 percent lower
       confidence bounds of the payment error rate using (A) the
       estimated correlation and (B) the minimum correlation rule
       in the regression estimate of variance, for Population A        D-11

D-3B   Cumulative distribution of the nominal 95 percent lower
       confidence bounds of the payment error rate using (A) the
       estimated correlation and (B) the minimum correlation rule
       in the regression estimate of variance, for Population B        D-12

D-3C   Cumulative distribution of the nominal 95 percent lower
       confidence bounds of the payment error rate using (A) the
       estimated correlation and (B) the minimum correlation rule
       in the regression estimate of variance, for Population C        D-13

E-1    Scatter charts illustrating the relationship between the direct
       estimate of variance and the estimate based on the
       regression, for states, for periods 3 and 4                     E-13

E-2    Average unit variance in FY 1981-82,
       for states arranged by that average                             E-14

E-3    Ratio of the variance of the direct estimate of unit variance
       to the composite estimate of unit variance using zero and
       high squared bias, related to the size of the Federal
       subsample                                                       E-15

E-4    State weights for pooled unit variance estimates, for states
       sequenced by weight for the simple pooled estimate              E-16

E-5    Weights for the composite estimate using
       zero and high squared bias                                      E-36

E-6    Relationship of the simple pooled estimate and the
       composite estimate using zero squared bias                      E-36

E-7    Relationship of various estimates of unit variance for
       1984 to the direct estimate for 1983 (x 10^)                    E-37

G-1    Federal gain as a proportion of Federal
       payment share of $20,000,000                                    G-8

G-2    Federal gain as a proportion of Federal
       payment share of $50,000,000                                    G-9

G-3    Federal gain as a proportion of Federal
       payment share of $300,000,000                                   G-10



CHAPTER 1. INTRODUCTION AND SUMMARY 



1.1 Introduction 

A formal quality control program was introduced in the early 1960's by 
the Social Security Administration (SSA) to provide guidance in assessing sources of
error in the administration of the Aid to Families with Dependent Children (AFDC) 
program in the various states. The Quality Control (QC) program required each 
state to institute a review of a sample of cases receiving benefits from AFDC, to 
carefully reinvestigate these cases and to evaluate the eligibility and amount of the 
payment made for each sample case, and to provide other information. The 
principal purpose of the QC review was to identify sources of error, to measure the 
magnitude of errors to the extent feasible, and to provide information that could 
guide in taking corrective action. The corrective action could be in the form of 
improving the administration of the system or of modifying legislation or 
regulations that were sources of problems.

The state QC sample has been drawn and administered by each state 
within the framework of the Federal regulations that prescribe and guide the QC 
program. The program is complicated by the fact that each state has different 
eligibility requirements and allowances, and the QC administration in a state needs 
to reflect these differences. Sample sizes in the larger states have been about 1200 
cases to be reviewed in each successive six-month period, with smaller samples in 
the states with small caseloads.^1

A Federal subsample was drawn from the QC sample in each state to
guide and facilitate the administration of QC. The eligibility and the AFDC
allowance for the subsampled cases were again intensively reviewed and evaluated.
This review provided a framework for improving the quality and comparability of
administration of the quality control programs although still taking account of the
differences in state systems.

^1 Optional smaller state sample sizes were recently authorized when QC was placed on an annual basis
provided the state signed a statement waiving its right to challenge the validity of the error rate
based on the reduced sample size.

Steps were taken in 1973 toward instituting a program of disallowances 
for states that did not meet a prescribed tolerance, by withholding the Federal share 
of AFDC payments that were made in error above the allowed tolerance level. This 
tolerance, which had been administratively established by the Department of Health 
and Human Services, was subsequently set aside by the Federal District Court as 
lacking an empirical basis. In 1980, the Congress established decreasing tolerances to 
be attained in fiscal years 1981 and 1982, with a tolerance of 4 percent in the 
overpayment error rate in 1983. The 4 percent tolerance was reiterated in the Tax 
Equity and Fiscal Responsibility Act (TEFRA) of 1982, which established a 3 percent 
tolerance for fiscal year 1984 and thereafter. Consequently, an important goal of the 
Federal QC sample review, in addition to providing guidance for improving 
program administration and design, became to provide estimates of overpayment
error rates for the purpose of determining the amount of any disallowances. 

It would be possible to estimate the overpayment error rates directly 
from the results obtained from the Federal subsample, without making use of the 
available state QC results. However, Westat recommended a double sampling 
procedure for drawing the Federal subsamples from the state samples and for 
preparing estimates. This procedure produced considerably more precise results 
from a given size of Federal subsample, and was adopted. 

More specifically, a regression estimator was recommended by Westat,
in memoranda dated June 18, 1973 and July 19, 1973.^2 The regression estimator with
double sampling makes use of the results available from the full quality control
sample for a state, together with those for the Federal subsample. Its use, in practice,
has generally had the effect of reducing the sampling variance of estimates of
payment error rates by about 50 percent or more, as compared with using only the
Federal subsample. Stated in a different way, the double sampling plan and
estimator generally yield results equivalent in precision to what would have been
obtained by at least doubling the size of the Federal subsample and basing the
estimate only on the Federal findings. However, if the Federal sample cases were
not reviewed by the state in advance of the Federal review, the quality of the Federal
review would be adversely affected, and its cost considerably increased, since in the
present procedure the Federal reviewer has an easier job (and presumably does a
better job) because s/he has the advantage of the previous state review. Thus, to
maintain the same quality without the use of the double sampling estimator, not
only would the Federal sample have to be increased by a factor of two to three, but
the sample would also need to be reviewed by the state. For example, if the state
sample size is now 1200 and the Federal sample 360 (giving a total of 1560 reviews),
doubling the sample would mean state and Federal samples of 720 (giving a total of
1440 reviews). This would reduce the cost of the QC reviews by only 8 percent
(assuming about equal costs for the state and the Federal review). If the Federal
sample size had to be somewhat more than doubled to get the same precision, as is
likely, the cost would actually be increased. Even more important is the fact that
reducing the size of the state sample in this manner would greatly reduce the
effectiveness of the QC program in its primary goal, that of identifying causes of
error and guiding appropriate corrective actions.

^2 Memorandum submitted by Morris Hansen, Westat, to John C. Young, Social and Rehabilitation
Service, DHEW, Review and evaluation of proposed use of QC system for Federal estimates of
ineligible and overpayment cases, June 18, 1973, and supplemental memorandum dated July 19, 1973.
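The precision and cost figures in the paragraph above can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes the usual large-sample approximation for double sampling with a regression estimator, under which the variance relative to an estimate from the Federal subsample alone is 1 - rho^2 (1 - n'/n). The correlation of 0.85 is an assumed value chosen for illustration, not a figure from this report, and the function names are hypothetical.

```python
def variance_ratio(n_state, n_federal, rho):
    """Approximate variance of the double-sampling regression estimate,
    relative to an estimate based on the Federal subsample alone."""
    return 1.0 - rho ** 2 * (1.0 - n_federal / n_state)


def review_cost_change(n_state, n_federal, factor):
    """Relative change in the total number of reviews if the state and
    Federal samples are both set to factor * n_federal (equal cost per
    review is assumed, as in the text)."""
    current = n_state + n_federal
    alternative = 2 * factor * n_federal
    return (alternative - current) / current


ratio = variance_ratio(1200, 360, rho=0.85)   # rho is an assumed value
print(f"variance ratio: {ratio:.3f}")
print(f"equivalent Federal-only sample: {360 / ratio:.0f} cases")
print(f"cost change, doubled sample: {review_cost_change(1200, 360, 2):+.1%}")
print(f"cost change, tripled sample: {review_cost_change(1200, 360, 3):+.1%}")
```

With the assumed correlation the variance ratio is close to one half, which corresponds to the "about 50 percent" reduction cited above; a different correlation would give a different ratio.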

It should be noted that the double sampling and regression estimation 
procedure does not "adjust" the state estimates — instead, it provides estimates of 
what would result if the Federal QC review, preceded by a state review, were applied 
to the entire caseload. It is simply a procedure for reducing the sampling error of the 
estimate from the Federal subsample. It makes use of the fact that the Federal and 
state findings on individual cases are highly correlated. Consequently, if the 
overpayment errors based on state findings for the cases in the Federal subsample 
are above those in the full state sample, then the Federal findings based on that 
sample are likely also to be too high. The regression estimator adjusts for the 
difference in average state findings in the two samples. A similar sampling error 
adjustment results if the state findings in the Federal sample are below the state 
findings in the full state sample. Thus, by use of the regression estimator, the 
effective sample size of the Federal subsample is increased substantially since there 
is a high correlation of case-by-case findings from the state and the Federal reviews. 
The estimate based on the Federal review in a state may or may not agree with the
state estimate, depending on the amount of agreement between individual Federal
and state case findings. Thus, the results from the regression estimator are estimates
of what would be obtained if the state QC review, followed by the Federal review, 
were applied each month to all cases receiving AFDC. Of course, such a procedure 
would be prohibitively costly. 

As currently used in AFDC, the regression estimator of the overpayment
error rate (referred to also as the payment error rate) for any given state is

$$\hat{R} = \frac{x''}{\bar{t}} = \frac{\bar{x}' + b\,(\bar{y} - \bar{y}')}{\bar{t}} \qquad (1)$$

where

$\bar{x}' = \sum_{i=1}^{n'} x_i / n'$ is the average overpayment error per case in the Federal
subsample as determined by the Federal review (it is the average
over all cases whether or not there was an overpayment error
involved);

$\bar{y} = \sum_{i=1}^{n} y_i / n$ is the average overpayment error in the state QC sample as
determined by the state review;

$\bar{y}' = \sum_{i=1}^{n'} y_i / n'$ is the average overpayment error as determined by the state QC
review for the cases included in the Federal subsample;

$\bar{t} = \sum_{i=1}^{n} t_i / n$ is the average AFDC payment for the cases in the state QC
sample;

$$b = \frac{\sum_{i=1}^{n'} x_i y_i - n'\,\bar{x}'\,\bar{y}'}{\sum_{i=1}^{n'} (y_i - \bar{y}')^2} \qquad (2)$$

is the regression coefficient estimated from the Federal
subsample;

$n$ is the size of the state QC sample;

$n'$ is the size of the Federal subsample;

$x_i$, $y_i$, and $t_i$ are, respectively, for the i-th case in the designated state or
Federal sample, the amount of overpayment as determined in
the Federal review, the amount of overpayment as determined
in the state QC review, and the AFDC payment for the case;

$$s_{\hat{R}} = \frac{s_x}{\bar{t}} \left\{ \frac{1 - r^2\,(1 - n'/n)}{n'} \right\}^{1/2} \qquad (3)$$

is the estimated standard error of $\hat{R}$;

$r = b\, s_y / s_x$ is the coefficient of correlation of $x_i$ and $y_i$, estimated from the
Federal subsample;

$s_x = \left\{ \sum_{i=1}^{n'} (x_i - \bar{x}')^2 / (n' - 1) \right\}^{1/2}$
is the unit standard deviation of the payment errors as
determined in the Federal review and as estimated from the
Federal subsample;

$s_y = \left\{ \sum_{i=1}^{n'} (y_i - \bar{y}')^2 / (n' - 1) \right\}^{1/2}$
is the unit standard deviation estimated from the Federal
subsample of payment errors as determined in the state review.
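The estimator and its standard error defined above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the production AFDC-QC computation: the function name and the toy data are hypothetical, the standard-error expression is the usual large-sample form for double sampling with a regression estimator, and the sketch assumes the subsample findings are not all identical (otherwise the slope is undefined).

```python
import math


def regression_estimate(x_fed, y_fed, y_state_mean, t_state_mean, n_state):
    """Double-sampling regression estimate of the payment error rate.

    x_fed, y_fed  -- Federal and state findings for the n' subsampled cases
    y_state_mean  -- mean state finding over the full state QC sample (size n)
    t_state_mean  -- mean AFDC payment over the full state QC sample
    Returns (R_hat, s_R, b, r).
    """
    n_fed = len(x_fed)
    x_bar = sum(x_fed) / n_fed
    y_bar_fed = sum(y_fed) / n_fed
    sxy = sum(x * y for x, y in zip(x_fed, y_fed)) - n_fed * x_bar * y_bar_fed
    syy = sum((y - y_bar_fed) ** 2 for y in y_fed)
    sxx = sum((x - x_bar) ** 2 for x in x_fed)
    b = sxy / syy                                  # regression coefficient
    s_x = math.sqrt(sxx / (n_fed - 1))             # unit s.d., Federal findings
    s_y = math.sqrt(syy / (n_fed - 1))             # unit s.d., state findings
    r = b * s_y / s_x                              # estimated correlation
    r_hat = (x_bar + b * (y_state_mean - y_bar_fed)) / t_state_mean
    s_r = (s_x / t_state_mean) * math.sqrt(
        (1 - r ** 2 * (1 - n_fed / n_state)) / n_fed)
    return r_hat, s_r, b, r


# Toy data (hypothetical): six subsampled cases out of a state sample of 20.
x_fed = [0.0, 0.0, 50.0, 0.0, 120.0, 0.0]
y_fed = [0.0, 0.0, 40.0, 0.0, 100.0, 0.0]
r_hat, s_r, b, r = regression_estimate(x_fed, y_fed, 20.0, 300.0, 20)
print(f"R_hat = {r_hat:.4f}, s_R = {s_r:.4f}, b = {b:.3f}, r = {r:.4f}")
```

When the state mean in the full sample equals the state mean in the subsample, the adjustment term vanishes and the estimate reduces to the Federal subsample mean divided by the average payment.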

The above and other formulas used (except as otherwise specified)
assume simple random sampling of the state QC sample from the file of AFDC
payment records, and of the Federal subsample from the state QC sample. In
practice, in most states the samples are drawn by proportionate stratified systematic
sampling procedures rather than simple random sampling. The stratification is by
months, with the same fraction of cases sampled each month. The systematic
selection within months ordinarily involves taking every k-th case from an ordered
list with a random start and with the ordering likely to involve geographic or
alphabetic sequencing, or both. Simple random sampling formulas are commonly
applied in such situations, and in this application they should give quite good
approximations.^3 In a few states, other modes of stratification are sometimes used.

In the original memoranda recommending the use of the regression
estimator to estimate the overpayment error rate and its standard error, T was used
in the denominator instead of t, where T is the average payment per case for the
total AFDC caseload for the period. It turned out that T was not reasonably available
in practice, and t has been substituted. As indicated later in Appendix I, this
substitution has been quite satisfactory.

A question that has concerned us about these estimators is that the 
regression estimator and its estimated standard error are based on approximations 
that hold for large enough samples, but that may not be reasonably acceptable for 
samples of the sizes used for the Federal subsample in some or all of the states. The 
size of the Federal subsample for a six-month period has varied generally between 
about 70 and 200 cases for the various states, and thus between about 140 and 
400 cases for a full year. Ordinarily, samples of these sizes would not be considered
too small if the samples were drawn from populations that are not extremely 
skewed. However, the populations in this case are extremely skewed, with no 
payment errors found in about 80 to 90 percent of the cases, and with considerably 
varying and highly skewed payment errors occurring in the remaining 10 to 
20 percent of the cases. 

Because of this concern, in a later memorandum^4 concerning the QC
program in Supplemental Security Income (SSI), we recommended, on the basis of a
preliminary evaluation, the substitution of a difference estimator for the regression
estimator. The difference estimator is of the same form as the regression estimator
except that a constant, k, is substituted for b (b is estimated from the sample and is
subject to sampling variability). The regression estimator is evaluated in Section 2.2,
where it is shown to provide unbiased or at most trivially biased estimates. The
difference estimator is evaluated for AFDC-QC in Appendix B, and compared with
the regression estimator. This evaluation shows little difference between the two
estimators and leads us to conclude that there is no advantage to AFDC in changing
to the difference estimator.

^3 We have compared such stratified sampling with simple random sampling for the Food Stamp QC
program, which is similar to the AFDC-QC program, and found remarkably close agreement of results
for the two procedures (i.e., simple random sampling and stratified proportionate sampling by
months).

^4 Memorandum dated September 30, 1981, submitted by Westat to Social Security Administration,
Office of Payment Eligibility and Quality.
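The difference estimator described above differs from the regression estimator only in using a fixed constant k in place of the sample-based slope b. A minimal sketch follows; the function name and toy data are hypothetical, and the default k = 1.0 is an assumption made for illustration, since the value of the constant is not stated in this section.

```python
def difference_estimate(x_fed, y_fed, y_state_mean, t_state_mean, k=1.0):
    """Difference estimate of the payment error rate.

    Same form as the regression estimate, but the slope is the fixed
    constant k (assumed 1.0 here) instead of a coefficient estimated
    from the Federal subsample.
    """
    n_fed = len(x_fed)
    x_bar = sum(x_fed) / n_fed          # mean Federal finding, subsample
    y_bar_fed = sum(y_fed) / n_fed      # mean state finding, subsample
    return (x_bar + k * (y_state_mean - y_bar_fed)) / t_state_mean


# Toy data (hypothetical): six subsampled cases.
x_fed = [0.0, 0.0, 50.0, 0.0, 120.0, 0.0]
y_fed = [0.0, 0.0, 40.0, 0.0, 100.0, 0.0]
print(difference_estimate(x_fed, y_fed, y_state_mean=20.0, t_state_mean=300.0))
```

Because k does not vary from sample to sample, the difference estimator avoids the sampling variability of b, at the cost of a possibly less efficient adjustment when k is far from the true regression slope.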

Some of the states have argued that if disallowances are to be imposed
they should not be computed on the basis of the point estimate, as now prescribed.
They suggest that since the overpayment error rates are based on samples, a lower
confidence bound should be used, i.e., a bound computed from the sample such that,
over all possible samples, there is a high probability that the computed lower bound
of the confidence interval is less than the true error rate, and a low probability
that it is greater.

Such an approach would, on the average, systematically and 
substantially underestimate the amount which would be disallowed if the true error 
rate were known. The state's gains would be the Federal government's loss. 
Moreover, the amount of the disallowances would depend importantly on the 
sample size (the disallowance for a state would be less for a given error rate, on the 
average, if a smaller QC sample size were used). Also, a problem arises because a 
state could lower the confidence bound by inadvertently or deliberately doing lower-quality
work in the state QC, thus increasing the sampling error of the regression
estimate of the payment error rate. This is because a reduction in the quality of the
state QC results would increase the number of discrepancies between the state and
Federal evaluations. These increased discrepancies would decrease the correlation
between the state and the Federal findings, and thus (as can be seen from
Equation (3) above) would increase $s_{\hat{R}}$, the estimated standard error of the regression
estimator. Since, for example, a 95 percent nominal lower confidence bound is
computed by subtracting $1.645\,s_{\hat{R}}$ from the estimated error rate, the result would be a
lower average value for the computed lower confidence bound and, hence, a 
smaller disallowance. Consequently, there might be an incentive for a state to lower 
the quality of work, in order to avoid or reduce disallowances. 
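The dependence of the lower confidence bound on the state-Federal correlation can be illustrated numerically. The sketch below assumes the large-sample standard-error form for the double-sampling regression estimator and the one-sided normal multiplier 1.645 cited in the text. All input values (error rate, unit standard deviation, average payment, sample sizes, correlations) are hypothetical and are not taken from the report.

```python
import math


def standard_error(s_x, t_bar, n_state, n_fed, r):
    """Large-sample standard error of the regression estimate (assumed form)."""
    return (s_x / t_bar) * math.sqrt((1 - r ** 2 * (1 - n_fed / n_state)) / n_fed)


def lower_bound(r_hat, s_r, z=1.645):
    """Nominal one-sided 95 percent lower confidence bound."""
    return r_hat - z * s_r


# Illustrative inputs only.
r_hat, s_x, t_bar, n_state, n_fed = 0.07, 60.0, 300.0, 1200, 360
for r in (0.9, 0.7, 0.5):
    s_r = standard_error(s_x, t_bar, n_state, n_fed, r)
    print(f"r = {r:.1f}   s_R = {s_r:.4f}   lower bound = {lower_bound(r_hat, s_r):.4f}")
```

For a fixed point estimate, the printed bound falls as the correlation falls, which is the mechanism by which lower-quality state QC work could reduce a disallowance based on the lower bound.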






We note (as discussed in Section 3.3 and in Appendix D) that a minor 
change in the standard procedure for computing lower confidence bounds would 
substantially eliminate this problem. This procedure involves assigning a 
minimum value for the correlation of Federal and state findings (a minimum rho) 
in estimating the variance. 
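One way to picture the minimum-correlation rule is as a floor applied to the sample correlation before the variance is computed. The sketch below is an assumption about how such a rule could be coded, not the procedure specified in Section 3.3 or Appendix D. The floor of 0.8 is the value that appears in the title of Table C-3; the function name and the other inputs are hypothetical.

```python
import math


def standard_error_min_rho(s_x, t_bar, n_state, n_fed, r, rho_min=0.8):
    """Standard error of the regression estimate with the sample
    correlation replaced by max(r, rho_min) (assumed form of the rule)."""
    r_used = max(r, rho_min)
    return (s_x / t_bar) * math.sqrt(
        (1 - r_used ** 2 * (1 - n_fed / n_state)) / n_fed)


# Illustrative inputs only.
for r in (0.9, 0.6, 0.4):
    s_r = standard_error_min_rho(60.0, 300.0, 1200, 360, r)
    print(f"r = {r:.1f}   s_R with floor = {s_r:.4f}")
```

Under this form, any sample correlation below the floor yields the same standard error as the floor itself, so a decline in state QC quality would no longer widen the interval or lower the bound.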

While more research is desirable, we have made enough progress that
some guidance is provided in this report on the first two of the following important
questions that you have asked us to examine:

• Are the sampling procedures and the regression methodology 
used by the AFDC-QC statistically valid? 

• What are the considerations and constraints involved in the 
choice of a lower confidence bound versus a point estimate in 
determining disallowances?

• What are the considerations and constraints in the choice of 
sample size for the state quality control samples and for the 
Federal review samples? 

• Are there any means of decreasing the sampling errors (and 
reducing the width of confidence intervals) of estimated state 
error rates other than by increasing sample size? 

In the following sections of this report, we provide some answers to
the first two of these questions in as nontechnical language as feasible, on the basis
of the work that has been completed. Fuller technical analyses and more detailed 
considerations of some of the issues and the implications of alternatives are 
included in the relevant appendices. Some very limited preliminary attention is 
given in this report to the last two questions. They will be more fully considered in 
a second report. 

Before proceeding to the more detailed discussion, we provide a 
summary of the principal conclusions from the work that has been done. 






1.2 Some Summary Results and Conclusions 

On the basis of the evaluation work that has been completed, we are 
able to summarize the results and conclusions as follows: 

(1) The procedures specified for drawing the state samples and the 
Federal subsamples are applications of standard and widely used sampling methods, 
and if the samples are made large enough, they will yield estimates of overpayment 
error rates as close as desired to the value being estimated. The value being 
estimated is defined as the expected value that would be obtained if the entire 
caseload were reviewed by both state and Federal reviewers (as is done for the 
Federal subsample). 

(2) The regression methodology for making estimates from the 
samples provides statistically valid estimates, unbiased in the sense that, on the 
average over all possible samples that could be drawn by the specified procedures for 
a state, the regression estimate of the overpayment error rate is equal or very nearly
equal to the value being estimated. This statement holds for each of the differing 
sample sizes in use in the various states. Moreover, as sample size increases, the 
sampling errors of the regression estimates decrease, and consequently the estimates 
are closer, on the average, to the value being estimated. 

(3) The sample estimates of the variance of the estimates of 
overpayment error rates are also, on the average, reasonably close to the variance
over all possible samples, and the computed sampling errors or confidence intervals 
provide, on the average, acceptable measures of precision. However, the sampling 
errors of the direct state variance estimates are so large that the use of the estimated 
variance from a single state sample for purposes of estimating needed sample sizes 
to achieve specified levels of precision, or to provide general measures of precision, 
can yield exceedingly variable and misleading results. In Section 2.5 a pooled 
variance estimation procedure is developed and presented that greatly improves the 
variance estimates for such uses. 

(4) Classical regression analysis requires the assumption of a linear 
relationship between the dependent and the independent variables, and normal
distributions of the dependent variable for given values of the independent
variable(s). The use of the regression estimator in estimating AFDC overpayment 
errors has been widely challenged on the grounds that the assumptions of classical 
regression are grossly violated. However, these challenges do not recognize the 
difference between classical regression analysis and the application of the regression 
estimator in sample surveys, as in AFDC-QC. For such applications, the 
assumptions are not required. Mathematical proof of the validity of the application 
of the regression estimator in sample surveys with sufficiently large samples, 
independent of the distribution from which the samples are drawn, is given by 
Cochran in a classical paper on regression estimation in sample surveys.¹ In
addition to that proof, we provide a number of examples involving different AFDC- 
QC populations and sample sizes illustrating the fact that the application of the 
regression estimator in AFDC for sample sizes similar to the sample sizes in use 
does yield valid results, as described in points (1) through (3) above (see Section 2.2 
and Appendix B). These illustrative results are provided for each of four sample 
sizes for each of three illustrative test populations based on actual AFDC data.

(5) We also note that in the application of the regression estimator 
to AFDC, the regressions involved are of sample means rather than of the original 
observations and the relationships between the sample means are indeed closely 
linear. Also, while the conditional distributions of the dependent variable for any 
given value of the independent variable are slightly skewed, they are reasonably 
close to normal (see Section 2.2). Consequently, although meeting the classical 
assumptions is not necessary, they are in fact reasonably met in the application of
the regression estimator in AFDC Quality Control.

(6) The distributions of individual case overpayment errors are
highly skewed. Consequently, the nominal 95 percent confidence intervals which
are now computed from the samples on the assumption of normal distributions are
imperfect. If the distributions of overpayment errors were normal, then, on the
average in repeated samples, for the sample sizes in use, close to 2-1/2 percent of the
time the value being estimated would be below the computed 95 percent confidence
interval and close to 2-1/2 percent of the time it would be above. In fact, the "tails"
above and below the confidence intervals of the overpayment error rate estimates
depart considerably from these expectations. For considerably less than 2-1/2 percent
of the samples the lower confidence bound is above the value being estimated, and
for considerably more than 2-1/2 percent of the samples the upper confidence bound
is below the value being estimated. The combined effect is that confidence intervals
cover the values being estimated with somewhat less than the nominal 95 percent
probability. Thus, the precision actually achieved is somewhat less than would be
the case if the 95 percent confidence were actually achieved. Nevertheless, the
95 percent (or 90 percent) confidence intervals provide reasonably satisfactory
indicators of precision. It is important to note that the estimates of overpayment
error rates are unaffected by any imperfections in the computed confidence
intervals.

¹Cochran, W.G., Sampling Theory When the Sampling Units are of Unequal Sizes, Journal of the
American Statistical Association, Vol. 37, pp. 199-212, 1942.

(7) We have developed and have done some testing of an 
improved method for computing confidence intervals that will yield considerably 
closer approximations to the nominal probabilities. The results appear in 
Section 2.4 and in Appendix C. 

(8) The decision on whether to use point estimates or lower 
confidence bounds in determining disallowances is a policy one, and depends on the 
goals to be served. There are precedents for both approaches, as discussed in (9)
through (12) below.

(9) If the goal is to approximate the true disallowance, i.e., the 
disallowance that would be made if the true overpayment error rate were known, 
the point estimate satisfies the goal. Business organizations use sampling with 
point estimates to settle the sharing of large costs or benefits, as in the distribution of 
funds from jointly furnished services (for example, the distribution of funds by the 
railroads from shipments that go over two or more lines), or as in the sharing of 
joint costs (for example, joint maintenance costs of poles used to carry both 
telephone and electric cables). Similarly, sample surveys with point estimates are
widely used in establishing rate bases for utilities (for example, to estimate 
replacement cost of plant and equipment from inspections of samples of such 
equipment) and in many other applications. Such applications of samples and the
point estimate generally call for samples large enough to yield reasonably precise
estimates. 

(10) Annual disallowances computed from QC samples are
commonly subject to relatively large sampling errors, especially if payment error
rates are less than about 4 percentage points above tolerance. Sampling errors of 
disallowances can be as much as 50 to 100 percent or more of a single year's 
disallowance. This problem could be substantially eliminated by making some 
modifications in the way disallowances are administered, so as to take fuller 
advantage of compensations over time (see Section 3.7). 

(11) If the goal is to assess disallowances separately for each year and 
then only to the extent that they have been reasonably proved to be at least a 
specified amount or more, then a lower confidence bound satisfies the goal. It is 
common in auditing, for example, to follow up leads of evidence of possible fraud 
from sample audits only if a lower confidence bound of an estimate is exceeded.* 

(12) Use of the lower confidence bound would, on the average, result
in AFDC disallowances that are much less than they would be if the true
overpayment error rates were known and used in computing disallowances. The 
Federal government would absorb the loss, and this loss would be substantial. 
Consequently, if lower confidence bounds were to be adopted for computing 
disallowances, cost-benefit considerations indicate that, for states in which large 
disallowances are involved, it would be advantageous to the Federal government to 
use considerably larger samples than those now used (see Sections 3.4 and 3.5). 
Increases in state samples may also be called for. 

(13) The determination of appropriate sample sizes for QC for 
purposes of evaluating and guiding improvements in the AFDC program involves 
difficult issues, and there are no simple answers. Some limited preliminary 
discussion of these issues appears in Chapter 3. 



*See, for example, Arkin, Herbert, Sampling Methods for Auditors, McGraw-Hill Book Company, New
York, pp. 56-58, 107-109.






(14) We see no obvious striking gains to be achieved by 
modifications in the design of the QC samples other than by increasing the state or 
Federal sample sizes. However, some gains may be feasible. Our explorations to 
date in this area are quite limited, and further work is needed in order to evaluate 
any such potential gains. 

(15) We add a final remark on a topic that we believe should be 
mentioned here. It has sometimes been suggested that the primary role of the QC 
samples should be to determine disallowances, and that corrective action inferences 
could better be guided by other special analyses and studies. Such a separation seems 
to be unnecessarily costly and undesirable. We anticipate that it may be possible to 
increase the effectiveness of the QC sample by subjecting the data to discriminant 
analyses, cluster analyses, or other methods of error-prone profiling, and thereby 
identify subclasses that contribute a high proportion of errors. Such studies could 
lead to the introduction of more effective stratification and more efficient allocation 
of the samples. The next phase of our study will include examining such methods
for improving precision without increasing sample size. Thus, if error-prone 
profiling proves to be effective, it could also help provide the much-needed 
improvements in precision of the QC sample when used for assessing 
disallowances. At the same time, it would also increase its effectiveness for analyses 
of sources of error and feedback for corrective action, and may also prove to be an 
effective tool for improving case reviews in administration. To separate the two 
uses would only add to cost and decrease performance. 

We note also that other sources of data such as income tax matching, 
wage matching, or bank matching have been suggested as an alternative to quality 
control reviews. Such data can be very useful, to the extent that their use is cost 
effective, in improving the administration of AFDC. Evaluation and possible 
extension of such uses are part of the current program of the Office of Family 
Assistance (OFA). These procedures do not replace the need for QC, but to the extent 
that they lower error rates, they may reduce the need for corrective action and may 
also reduce disallowances. After sufficient reduction in error rates has been 
accomplished in a state, then a reduction in the size of the QC sample would be 
appropriate in that state ~ but the sample must still be large enough to monitor for 
early detection of a serious deterioration of quality. 
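
Point (10) above can be illustrated with a small calculation. The sketch below is a minimal illustration, assuming that a disallowance is proportional to the excess of the estimated payment error rate over the tolerance level, so that the relative sampling error of the disallowance equals the standard error of the error rate divided by that excess. The function name and the 0.8 percentage point standard error are hypothetical values chosen for illustration; they are not taken from the AFDC regulations.

```python
def relative_sampling_error(points_above_tolerance, se_points):
    """Standard error of a disallowance as a fraction of the disallowance.

    Assumes (for illustration only) that the disallowance is proportional
    to the excess of the error rate over tolerance, so that the constant
    of proportionality cancels out of the ratio.
    """
    if points_above_tolerance <= 0:
        raise ValueError("error rate must exceed tolerance")
    return se_points / points_above_tolerance


# Hypothetical standard error of 0.8 percentage points for the error rate.
ASSUMED_SE_POINTS = 0.8

for excess in (1.0, 2.0, 4.0):
    share = relative_sampling_error(excess, ASSUMED_SE_POINTS)
    print(f"{excess:.0f} point(s) above tolerance: "
          f"sampling error is about {round(100 * share)} percent of the disallowance")
```

Under these assumptions the relative sampling error shrinks as the error rate moves further above tolerance, which is the pattern described in point (10).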




CHAPTER 2. STATISTICAL VALIDITY OF AFDC-QC METHODOLOGY



We first address the question: 

• Are the sampling procedures and the regression 
methodology used by the AFDC-QC statistically valid? 

We have examined the specified sample selection and estimation 
procedures, and have reviewed existing theory and in some cases extended the 
theory. The available theory is not exact but holds for large enough samples. 
However, available statistical theory does not tell us what size samples are large 
enough; that is, what size samples are needed to achieve sufficiently close 
approximations. Consequently, we have done extensive simulations by drawing 
large numbers of independent samples from three test populations and prepared 
estimates from them for alternative sample sizes for each of the populations. The 
test populations, described in Appendix A, are samples of actual AFDC-QC cases. 
Many of our conclusions are based on the results of these simulations. 

In the balance of this report, we discuss more fully and illustrate the 
basis for most of the summary remarks that appear at the end of Chapter 1, and 
provide some extensions of them. 



2.1 Test Populations 

To examine the accuracy of the approximations, we have done 
extensive testing with three test populations (referred to as Populations A, B, and C) 
using actual AFDC-QC data from the Federal subsamples for the year ending 
September 30, 1982. 

Population A was created by taking the state and Federal QC results for 
the cases included in the Federal subsample for Illinois, New Jersey, Ohio, and 
Pennsylvania. These were four large states that had roughly similar average 
payments for AFDC and roughly similar average overpayment error rates.






Population B used the state and Federal QC results for cases included in 
the Federal sample for Texas, South Carolina, Maryland, and Michigan. These are
relatively large states with somewhat different characteristics from those of
Population A. 

Population C used the state and Federal QC results for cases included in 
the Federal subsample for six states with relatively smaller AFDC-QC sample sizes, 
including Arkansas, Colorado, Hawaii, Nebraska, Oregon, and West Virginia. 

Some of the characteristics of the three test populations and of the 
AFDC results for all states for the six-month period ending September 1982 are 
summarized in Table 2-1 and more fully in Appendix A. 

Various tests were carried through by drawing 1000 independent 
samples of each of a number of specified sample sizes from these test populations, 
and computing and evaluating various estimates from these samples. Among the 
sample sizes used in evaluating the regression methodology were the following: 





                                        Annual sample size
                                    1        2        3        4

Size of state sample, n           2400     1200      880      350
Size of Federal subsample, n'      360      360      260      160



Each of the state samples was obtained by drawing with replacement 
from the population a simple random sample of the specified size, and then 
drawing a simple random sample without replacement from the state sample for 
the Federal subsample. Drawing the state sample with replacement has the effect of 
making the simulation process equivalent to drawing the sample from a much 
larger population, and in effect, simulates the drawing of the state sample from a
very large state AFDC population equivalent in composition to the test population. 
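
The drawing procedure just described can be sketched in a few lines. This is a minimal sketch rather than the program actually used for the simulations: the function name, the toy population, and the seed are hypothetical, and each case is represented simply as a pair of (state finding, Federal finding) in dollars.

```python
import random


def draw_double_sample(population, n, n_prime, rng):
    """Draw one state sample and its Federal subsample.

    population: list of (state_finding, federal_finding) pairs, one per case.
    Returns (state_sample, federal_subsample), each a list of such pairs.
    """
    # State sample: simple random sample WITH replacement, which mimics
    # sampling from a very large caseload with the same composition.
    state_sample = [rng.choice(population) for _ in range(n)]
    # Federal subsample: simple random sample WITHOUT replacement,
    # selected from the cases already in the state sample.
    positions = rng.sample(range(n), n_prime)
    federal_subsample = [state_sample[i] for i in positions]
    return state_sample, federal_subsample


# Toy population with made-up dollar values (not AFDC data).
toy_population = [(0.0, 0.0)] * 87 + [(50.0, 60.0)] * 8 + [(250.0, 250.0)] * 5
rng = random.Random(1987)
state, federal = draw_double_sample(toy_population, n=2400, n_prime=360, rng=rng)
print(len(state), len(federal))
```

In an actual review the Federal finding is observed only for the subsampled cases; in a simulation both findings are known for every case in the test population, which is what makes the evaluation possible.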






Table 2-1. Some characteristics of the test populations and of the full AFDC population (1982)

                                                            Test Population        Average U.S.,
                                                                                   6 months ending
                                               Units       A       B       C       September 1982

Average AFDC payment (T)                       dollars    296     210     255          302
Standard deviation of payments                 dollars    255     121     194          n/a
Overpayments
  Average based on Federal review              dollars    21.6    15.0    16.9         20.
  Average based on state QC review             dollars    17.2    16.7    13.7         n/a
Unit standard deviation of overpayments,
  Federal review                               dollars    70.5    58.6    66.1         n/a
Correlation of Federal and state
  overpayments                                   --       0.83    0.94    0.81         0.85*
Overpayment rate (Federal review)              percent    7.30    7.95    6.62         6.64
Percent with overpayments (Federal review)     percent    12.7    13.1    11.2         15.2

n/a - Not readily available.

*Simple mean of the estimates for the 45 states that did not treat their samples as stratified samples for the state
QC during this period (the mean was roughly the same for the remaining states).






Table 2-2 shows state and Federal AFDC-QC sample sizes by state, for 
the year ending September 30, 1982. Sample sizes 1 and 2 above correspond 
approximately to and are illustrative of the sample sizes used in about 24 of the 
larger states. Sample sizes 3 and 4 are illustrative of samples used in a number of 
medium-sized and smaller states. 



2.2 Evaluation of the Regression Estimator 

Classical regression analysis is based on the assumption of a linear 
relationship between the dependent and the independent variables, and on the 
assumption that the dependent variable is approximately normally distributed for 
each value of the independent variable. However, as we have noted in Section 1.2, 
the fact that the joint distribution of individual state and Federal case findings of 
payment errors fails to satisfy these assumptions is not relevant for the choice of an
estimator. As can be seen from Equation (1), (Section 1.1), the regression estimator 
depends, not on the relationship of state and Federal findings of error for the 
individual cases, but on the relationship of the sample means of those findings in 
the Federal subsample. Based on 1000 independent samples from each test 
population for each of four sample sizes, it is clear that the relationship between the 
means is closely linear. Figure 2-1 shows scatter diagrams of the relation of the 
sample mean of Federal findings and the sample mean of state findings for the same
sample, for 1000 samples drawn from Test Population A for each of four different
sample sizes.¹ It is clear from the diagrams that there is little if any departure from a
linear relationship. Also, the distributions of the points about the fitted lines are
approximately although not quite normal. Thus, the assumptions of classical
regression analysis are fairly well satisfied. We emphasize again, however, that
although the classical assumptions appear to be reasonably well satisfied, meeting
them is not required in order to assure the validity of the regression estimator.
Rather, that validity requires only that the variances and covariance involved are
finite, and that the sample is sufficiently large (see Cochran, op. cit., p. 203, and see
also Appendix B). Since the first of these conditions is obviously satisfied when
sampling from a finite population such as the AFDC case determinations, it remains
only to ask if the samples used in AFDC-QC are large enough. It is for this purpose
that we examine the results of sampling from test populations made up of real data,
using sample sizes that approximate those in actual use.

¹Similar diagrams for two other test populations are included in Appendix B.
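
The role of the sample means in the regression estimator can be made concrete with a short sketch. Equation (1) of Section 1.1 is not reproduced here, so the code below assumes the standard textbook form of the double-sampling regression estimator: the mean Federal finding in the subsample, adjusted by the fitted slope times the difference between the state-finding mean in the full state sample and in the subsample. The function name and the numbers are hypothetical, and the result is a mean dollar overpayment per case rather than an error rate.

```python
from statistics import mean


def regression_estimate(state_x, sub_x, sub_y):
    """Double-sampling regression estimate of the mean Federal finding.

    state_x: state findings for all n cases in the state sample
    sub_x:   state findings for the n' subsampled cases
    sub_y:   Federal findings for the same n' cases
    """
    x_bar = mean(state_x)   # state mean over the full state sample
    x_sub = mean(sub_x)     # state mean over the Federal subsample
    y_sub = mean(sub_y)     # Federal mean over the Federal subsample
    sxx = sum((x - x_sub) ** 2 for x in sub_x)
    sxy = sum((x - x_sub) * (y - y_sub) for x, y in zip(sub_x, sub_y))
    slope = sxy / sxx if sxx > 0 else 0.0   # fitted within the subsample
    return y_sub + slope * (x_bar - x_sub)


# Made-up findings in dollars: the Federal review finds $10 more per case.
sub_x = [0.0, 100.0, 200.0]
sub_y = [10.0, 110.0, 210.0]
state_x = sub_x + [0.0, 0.0, 100.0]
estimate = regression_estimate(state_x, sub_x, sub_y)
print(round(estimate, 2))
```

The adjustment term depends only on the two state-finding means, which is why the relationship between sample means, rather than between individual case findings, is what matters for the estimator.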






Table 2-2. Sample sizes by state for 12-month period ending September 30, 1982. [Samples are treated
as stratified samples in some states, with stratum figures shown in parentheses ( ).]

                          State  Federal                            State  Federal
                         sample   sample                           sample   sample
State                         n       n'      State                     n       n'

Alabama                   2,211      377      Michigan              2,396      361
Alaska                      314      134      Mississippi           1,995      365
  01                      (225)     (96)      Minnesota             1,718      311
  02                       (89)     (39)      Missouri              2,580      389
Arizona                     748      229      Montana                 330      156
Arkansas                  1,070      301      Nebraska                424      183
California                2,432      366      Nevada                  329      152
Colorado                    908      274      New Hampshire           295      137
  06                      (129)     (40)      New Jersey            2,358      362
  07                      (655)    (193)      New Mexico              636      208
  61                       (33)      (8)      New York              2,483      364
  62                       (91)     (33)      North Carolina        2,422      3oo
Connecticut               1,733      356      North Dakota            346      160
Delaware                    304      167      Ohio                  2,491      386
District of Columbia        938      266      Oklahoma              1,409      298
Florida                   2,534      394      Oregon                1,174      285
Georgia                   2,445      376      Pennsylvania          2,466      375
Hawaii                      605      210      Rhode Island            625      211
Idaho                       334      129      South Carolina        2,431      376
Illinois                  2,381      358        01                (1,221)    (175)
  01                      (339)     (47)        02                (1,210)    (201)
  02                    (1,478)    (223)      South Dakota            326      151
  03                      (564)     (88)      Tennessee             2,157      359
Indiana                   2,063      364      Texas                 2,399      374
Iowa                        U08      304      Utah                    323      172
Kansas                      776      242      Vermont                 301      156
Kentucky                  2,137      364      Virginia              2,330      358
Louisiana                 2,421      382      Washington            1,942      341
Maine                       631      218      West Virginia           971      273
Maryland                  2,425      365      Wisconsin*            2,508      394
Massachusetts             2,401      354        01                (1,704)    (266)
  00                    (1,193)    (175)        02                  (804)    (128)
  01                      (594)     (92)      Wyoming                 339      168
  02                      (614)     (87)

*Figures quoted are twice those for the last 6 months of the year.






Figure 2-1. Mean findings of dollar error per case in 1000 independent samples for each of four
sample sizes, Population A

[Scatter diagrams, one panel for each sample size: n=2400, n'=360; n=1200, n'=360; n=880, n'=260;
n=350, n'=160. Each panel plots the sample mean of Federal findings against the sample mean of
state findings (horizontal axis: state finding).]






Some of the results based on replicate samples drawn from
Population A are summarized in Table 2-3. Similar results were obtained for the
other test populations and are presented in Appendix B. These results indicate that
the regression methodology provides valid estimates of overpayment error rates
for the various sizes of annual state and Federal samples in use. By valid estimates,
we mean that for a given sample size the average of the estimates over a large
number of samples is close to the value being estimated, and that the computed
sampling errors or confidence intervals provide approximate but acceptable
indicators of precision.

Illustrations are provided by comparing lines 1 and 2 of Table 2-3 and 
also by comparing the differences between these (line 3) with their estimated 
standard errors (line 4). For each sample size, the average of the overpayment error 
rate estimates is closely equal to the overpayment error rate in the test population. 
Similar results are seen from the additional comparisons available in Table B-3 of 
Appendix B. While the estimates are almost all less than the population values, the
differences are all far less than their sampling errors. All such differences contribute
less than one percent to the estimated mean square errors of R. We conclude that
there is a trivial negative bias in the regression estimator. Any such bias decreases
faster than the sampling error decreases as sample size is increased. 

Table 2-3 also illustrates that, with the regression methodology applied 
to Test Population A, the estimated variances of R (line 6) are all reasonably close to
the estimated true variances (line 5). The differences are all small relative to their 
estimated standard errors. Again, similar results are seen in Table B-3 of 
Appendix B for Test Populations B and C. 
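
The kind of replicate evaluation summarized in Table 2-3 can be sketched as follows. This is an illustration under stated assumptions, not the program used for the report: the toy population is made up, the estimator is the textbook double-sampling regression estimator of the mean Federal finding (in dollars, not the error rate), and the variance formula is the standard large-sample textbook approximation, which may differ in detail from the formula used in AFDC-QC.

```python
import random
from statistics import mean, pvariance


def estimate_and_variance(state_x, sub_x, sub_y):
    """Regression estimate of the mean Federal finding and its estimated variance."""
    n, m = len(state_x), len(sub_x)
    x_bar, x_sub, y_sub = mean(state_x), mean(sub_x), mean(sub_y)
    sxx = sum((x - x_sub) ** 2 for x in sub_x)
    syy = sum((y - y_sub) ** 2 for y in sub_y)
    sxy = sum((x - x_sub) * (y - y_sub) for x, y in zip(sub_x, sub_y))
    slope = sxy / sxx if sxx > 0 else 0.0
    estimate = y_sub + slope * (x_bar - x_sub)
    s2_y = syy / (m - 1)                      # variance of Federal findings
    s2_res = (syy - slope * sxy) / (m - 2)    # residual variance about the line
    # Assumed textbook large-sample variance for double sampling.
    variance = s2_res / m + (s2_y - s2_res) / n
    return estimate, variance


def evaluate(population, n, m, replicates, seed):
    """Repeat the double-sampling draw and summarize the estimates."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(replicates):
        state = [rng.choice(population) for _ in range(n)]    # with replacement
        sub = [state[i] for i in rng.sample(range(n), m)]     # without replacement
        est, var = estimate_and_variance(
            [x for x, _ in state], [x for x, _ in sub], [y for _, y in sub])
        estimates.append(est)
        variances.append(var)
    return {
        "true_value": mean(y for _, y in population),
        "average_estimate": mean(estimates),
        "variance_of_estimates": pvariance(estimates),
        "average_estimated_variance": mean(variances),
    }


# Made-up (state finding, Federal finding) pairs in dollars; not AFDC data.
toy_population = ([(0.0, 0.0)] * 85 + [(0.0, 80.0)] * 2
                  + [(50.0, 60.0)] * 8 + [(250.0, 250.0)] * 5)
summary = evaluate(toy_population, n=400, m=100, replicates=300, seed=1982)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Comparing the first two printed values corresponds to lines 1 and 2 of Table 2-3, and comparing the last two corresponds to lines 5 and 6.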

2.3 Evaluation of Computed Confidence Intervals 

Another way to examine the validity of the regression methodology is 
to determine, for example, the proportion of times in repeated sampling that the 
computed nominal 95 percent or 90 percent (two-tailed) confidence intervals 
include the true payment error rate, and the proportion of times that the true 
payment error rates are above or below the specified nominal confidence bounds. 
Such results are shown in Table 2-4. 






Table 2-3. Evaluation of regression estimator based on computations for 1000 independent samples
drawn from Test Population A

                                                                Sample size (n and n')
                                                             1         2         3         4
                                                     n  =  2400      1200       880       350
Statistic                                            n' =   360       360       260       160

1. True overpayment error rate in test population         .0730     .0730     .0730     .0730

2. Average of estimated overpayment error rates
   from 1000 samples
   ($\bar{R} = \sum_k \hat{R}_k / 1000$)                   .0731     .0724     .0727     .0729

3. Difference (Line 1 - Line 2)                           -.0001     .0006     .0003     .0001

4. Estimated standard error of difference
   (standard deviation of $\bar{R}$ from 1000 samples)    .00025    .00027    .00033    .00048

5. Estimated true variance of $\hat{R}$ based on
   variance of $\hat{R}$ from 1000 samples
   $\hat{\sigma}^2_{\hat{R}} = [\sum_k (\hat{R}_k - \bar{R})^2 / 1000]$ ($\times 10^4$)
                                                           .628      .704     1.073      2.29

6. Average of estimated variances of $\hat{R}$ from
   each of 1000 samples
   $\mathrm{av}(s^2_{\hat{R}}) = [\sum_k s^2_{\hat{R}_k} / 1000]$ ($\times 10^4$)
                                                           .645      .799     1.100      2.19

7. Difference (Line 5 - Line 6)                           -.017     -.095     -.027       .10

8. Estimated standard error of difference*                 .031      .109      .053      .113

9. Standard error of estimated variances of $\hat{R}$
   $[\sum_k (s^2_{\hat{R}_k} - \mathrm{av}(s^2_{\hat{R}}))^2 / 1000]^{1/2}$ ($\times 10^4$)
                                                            .22       .23       .39       .87

*Computed from
$\sigma_d^2 = \frac{1}{999} \{ (\text{standard error of estimated variance of } \hat{R})^2 + (\hat{\sigma}^2_{\hat{R}})^2 (\beta - 1) \}$
with $\beta$ assigned the value 3.3. Essentially the same results would have been obtained for $\beta$ assigned values from 3
to 4, which seem reasonable from Figure C-1 in Appendix C. Direct estimates of $\beta$ varied between 2.8 and 3.2.
The value 3.3 was taken as an approximation before the direct estimates were available, and was so close that it
was not worth recomputing.






Table 2-4. Proportion of observed samples in which value being estimated was above, below, or
covered by specified nominal confidence bounds, for Test Populations A, B, and C

                                  Test                    Sample sizes (n/n')
Nominal confidence bound          Population    2400/360   1200/360    880/260    350/160

Below .025 point                  A               .011       .006       .010       .013
                                  B               .011       .012       .008       .017
                                  C               .003       .011       .009       .007
                                  Average         .006       .010       .009       .012

Below .05 point                   A               .024       .028       .028       .031
                                  B               .032       .030       .033       .036
                                  C               .014       .021       .020       .028
                                  Average         .023       .026       .027       .032

Above .95 point                   A               .084       .097       .100       .102
                                  B               .093       .072       .093       .096
                                  C               .093       .103       .113       .120
                                  Average         .090       .091       .102       .106

Above .975 point                  A               .053       .059       .066       .075
                                  B               .067       .042       .055       .062
                                  C               .060       .080       .084       .087
                                  Average         .060       .060       .068       .075

Between .05 and .95 points        A               .892       .875       .872       .867
                                  B               .875       .898       .874       .868
                                  C               .893       .876       .867       .852
                                  Average         .887       .883       .871       .862

Between .025 and .975 points      A               .936       .935       .924       .912
                                  B               .922       .946       .937       .921
                                  C               .937       .909       .907       .906
                                  Average         .932       .930       .923       .913

Based on 1000 independent replicate samples for each sample size for each test population.






The nominal 95 percent confidence intervals (and other confidence
intervals) as now computed for AFDC-QC make use of normal distribution theory,
i.e., assume that the estimated payment error rate and its estimated standard error
are distributed approximately as they would be for an estimated mean based on
simple random samples of about 30 or more observations drawn from a normal
distribution. Thus, the 95 percent confidence intervals are computed for the
overpayment error rate, R, by computing $R \pm 1.96\, s_R$, where $s_R$ is
the estimate from the sample of the standard error of R. For large enough samples
drawn from the AFDC population of overpayment errors, the probability that such a
confidence interval will cover the true value will be reasonably close to the nominal
95 percent. We refer to this as the nominal probability. If the overpayment errors
were normally distributed, then, on the average, approximately 95 percent of such
confidence intervals would include the value being estimated, and in about 2-1/2
percent of the samples the lower bound would be above the value being estimated,
and in about 2-1/2 percent of the samples the upper bound would be below it.
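
As a minimal sketch of the computation just described, the function below forms a nominal two-tailed interval from an estimate and its standard error. The function name is hypothetical. The inputs are taken from Table 2-3 for the largest sample size (error rate .0730, variance .628 x 10^-4) purely as an illustration; the variance shown there is the variance over replicate samples, not an estimate from a single sample.

```python
import math


def nominal_interval(estimate, standard_error, z=1.96):
    """Nominal two-tailed confidence interval based on normal theory.

    z = 1.96 gives the nominal 95 percent interval; z = 1.645 would give
    the nominal 90 percent interval.
    """
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


# Illustrative inputs from Table 2-3, sample size 1 (n=2400, n'=360).
rate = 0.0730
standard_error = math.sqrt(0.628e-4)
lower, upper = nominal_interval(rate, standard_error)
print(f"{lower:.4f} to {upper:.4f}")
```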

In AFDC-QC, as illustrated in Table 2-5 for Test Population A, the
distribution of overpayment errors is a very skewed rather than a normal
distribution. Also, AFDC-QC uses a double sample and a regression estimator. To help
evaluate the usefulness of the computed confidence intervals under these
circumstances, we have examined how close the observed probabilities are to the nominal
probabilities. We have done this by taking repeated independent samples from each
of the three test populations described in Section 2.1 and more fully in Appendix A.

From Table 2-4, it is seen that for each test population and, on the
average over the three test populations, the fraction of samples for which the true
value was below the nominal 95 percent two-tailed confidence interval is considerably
less than the 2-1/2 percent that would be expected if the samples were drawn from
normal distributions. Conversely, the true value was above the computed confidence
intervals in a considerably higher fraction than the nominal 2-1/2 percent. More
specifically, on the average for the three test populations, for each sample size the value
being estimated falls below the lower nominal 95 percent confidence bound for only about
1 percent of the samples, and in about 6 to 7 percent of the cases it falls above the
upper nominal confidence bound. The differences between these percentages and
the 2-1/2 percent nominal percentage cannot be explained by sampling variability.
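
The tabulation behind Table 2-4, and the statement that the departures exceed sampling variability, can be sketched as follows. The function names are hypothetical. The comparison at the end treats the 3000 replicate samples (1000 for each of three test populations) as independent trials with the nominal 2-1/2 percent probability, which is an approximation adopted here only for illustration, and uses the averages shown in Table 2-4 for the 1200/360 sample size (.010 and .060).

```python
import math


def tail_fractions(true_value, estimates, standard_errors, z=1.96):
    """Fractions of samples with the true value below, above, and inside
    the nominal interval estimate +/- z * standard_error."""
    below = above = 0
    for est, se in zip(estimates, standard_errors):
        if true_value < est - z * se:
            below += 1      # lower bound lies above the true value
        elif true_value > est + z * se:
            above += 1      # upper bound lies below the true value
    k = len(estimates)
    return below / k, above / k, (k - below - above) / k


def proportion_se(p, replicates):
    """Binomial standard error of an observed proportion."""
    return math.sqrt(p * (1.0 - p) / replicates)


nominal = 0.025
se = proportion_se(nominal, 3000)
for label, observed in (("below lower bound", 0.010), ("above upper bound", 0.060)):
    deviation = (observed - nominal) / se
    print(f"{label}: {deviation:+.1f} standard errors from nominal")
```

Both observed tail proportions lie several standard errors from the nominal value under this approximation, in opposite directions.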






Table 2-5. Percentage distribution of overpayment errors as determined by the state and Federal
evaluation for Test Population A (Note that in this table, as in the analyses,
underpayment errors are treated as zero overpayment errors.)

Overpayment                       Overpayment errors ($) per Federal QC
errors ($)
per state QC      None    1-99   100-199  200-299  300-399  400-499  500-599    Total

None              86.7     0.5     0.6      0.6      0.3      0.2      0.1       89.0
1-99               0.4     4.4     --       0.1      --       --       --         5.0
100-199            0.1     --      2.0      --       --       --       --         2.1
200-299            0.1     --      --       2.2      --       --       --         2.2
300-399            0.1     --      --       --       1.5      --       --         1.6
400-499            --      --      --       --       --       0.1      --         0.1

Total             87.3     4.9     2.6      2.9      1.8      0.3      0.1








Table 2-4 also shows that for the largest sample size (n=2400, n'=360)
the coverage of the computed (two-tailed) 95 percent nominal confidence intervals
for the test populations falls short of but conforms approximately to expectations.
More specifically, on the average for the three test populations, for 93.2 percent of this
particular set of 3000 repeated samples (1000 for each test population), the 95 percent
nominal confidence intervals include the value being estimated. Such estimates
are, of course, subject to sampling errors. For the next sample size (n=1200, n'=360),
the observed average proportion of the 95 percent nominal confidence intervals that
include the value being estimated is similar but slightly lower, being about
93 percent. For the two smaller sample sizes (n=880, n'=260 and n=350, n'=160) the
proportions are about 92 percent and 91 percent, respectively. While these are
statistically significant departures from expectation for normal distributions, the
results are nevertheless close enough that the computed confidence intervals can be
interpreted as providing useful measures of the precision of estimated error rates,
with the observed probabilities being somewhat less than but reasonably close to
expectation. They tend to be closer to the nominal probabilities for the larger sample
sizes. However, from Table 2-4 it is seen that for the lower tails (below the 2-1/2
percent and 5 percent nominal bounds), or for the upper tails (above the 95 percent
and 97-1/2 percent nominal bounds), the probabilities do not tend to be closer to the
nominal probabilities for the larger samples. We presume this is because the
subsampling ratio n'/n is lower for the larger sample sizes, and especially for the
largest sample size used in the analyses.

As seen from Figure 2-2, for the sample sizes in use, the distributions of
the estimated overpayment error rates appear to be reasonably close to normal,
although still moderately skewed. As discussed in Appendix C, the departure from
expected proportions in each of the two tails of the confidence intervals arises
because the distributions of payment errors are considerably skewed, resulting in a
positive correlation of the estimated standard deviations with the estimated
overpayment error rates, and especially because of the wide variability in the
estimated standard deviations. As a result, the computed upper and lower nominal
95 percent confidence bounds are both somewhat low.
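
The mechanism just described can be illustrated with a small simulation. The sketch below is not the AFDC-QC procedure: it assumes a simple mean of a unit exponential population (a convenient skewed stand-in for the payment error distribution) in place of the double sampling regression estimator, and the function name, sample size, and seed are hypothetical choices made only for illustration.

```python
import math
import random

def tail_miss_rates(n=30, reps=2000, t=1.96, seed=1):
    """Return (fraction of samples with true mean below the lower bound,
    fraction with true mean above the upper bound) for the interval
    mean +/- t * s / sqrt(n), sampling from a skewed (exponential) population."""
    rng = random.Random(seed)
    true_mean = 1.0  # mean of the unit exponential population (assumed stand-in)
    below = above = 0
    for _ in range(reps):
        xs = [rng.expovariate(1.0) for _ in range(n)]
        m = sum(xs) / n
        s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
        half = t * s / math.sqrt(n)
        if true_mean < m - half:
            below += 1   # true value falls below the lower bound
        elif true_mean > m + half:
            above += 1   # true value falls above the upper bound
    return below / reps, above / reps

below, above = tail_miss_rates()
print("below lower bound:", below, " above upper bound:", above)
```

Because small sample means tend to come with small estimated standard deviations in a skewed population, the misses should concentrate above the upper bound, which is the same direction of asymmetry reported in Table 2-4.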






Figure 2-2. Distribution of estimated payment error rate (based on 1000 samples from Population A)

[Figure not reproduced. Horizontal axis: payment error rate, .01 to .14. Curves:
A: n=2400, n'=360; B: n=1200, n'=360; C: n=880, n'=260; D: n=350, n'=160.]






2.4 An Improved Procedure for Computing Confidence Bounds 

The results summarized in Section 2.3 above are for confidence
intervals as they are now computed. We have explored several alternatives for
computing confidence intervals and describe here an alternative method that
involves the use of "Jackknife replicates."² The greater the number of Jackknife
replicates used, the greater is the precision of the variance estimates, but also the
greater the computation costs. In practice, a compromise choice is often made, and
from 30 to 60 replicates are frequently used.

One way that K Jackknife replicates can be formed, after selection of the
state and Federal samples for a state, is by first dividing the state sample into K
random subsets of equal or nearly equal size (each subset would be a stratified
random subsample if the original sample was stratified). A Jackknife replicate is
then formed by dropping one of the random subsets from the total sample and
retaining in the replicate all of the remaining cases. A total of K overlapping
replicate samples is formed by repeating this for each of the K subsets. The Federal
findings are used for the cases in a replicate that are members of the Federal subsample.

The regression estimate of the overpayment error rate is made 
separately for each replicate as well as for the total sample. Then an estimate of the 
variance of the overpayment error rate for the whole sample is obtained by 
computing 



$$
s_{\hat{R}}^{2} = \frac{K-1}{K} \sum_{k=1}^{K} \left(\hat{R}_k - \hat{R}\right)^{2}
$$

where $\hat{R}_k$ is the estimated overpayment error rate for the k-th Jackknife replicate,
and $\hat{R}$ is the estimate for the whole sample.
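
The drop-one-subset procedure can be sketched as follows. This is a minimal illustration under stated assumptions: `estimator` is a hypothetical stand-in for whatever statistic is being replicated (in the report, the double sampling regression estimate, which is defined in Chapter 1 and not reproduced here), the sample is unstratified, and the function and argument names are ours.

```python
import random

def jackknife_variance(cases, estimator, k, seed=0):
    """Form K drop-one-subset Jackknife replicates and return
    (full-sample estimate, list of replicate estimates, variance estimate).

    cases     -- list of sample cases (any objects the estimator understands)
    estimator -- callable taking a list of cases and returning a number
                 (hypothetical stand-in for the regression estimate)
    k         -- number of random subsets, hence number of replicates
    """
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    # K random subsets of equal or nearly equal size.
    groups = [shuffled[g::k] for g in range(k)]
    full = estimator(shuffled)
    reps = []
    for g in range(k):
        # Drop subset g and keep all remaining cases.
        kept = [c for h, grp in enumerate(groups) if h != g for c in grp]
        reps.append(estimator(kept))
    variance = (k - 1) / k * sum((r - full) ** 2 for r in reps)
    return full, reps, variance

mean = lambda xs: sum(xs) / len(xs)
full, reps, var = jackknife_variance([1, 2, 3, 4, 5], mean, k=5)
print(full, var)
```

For a simple mean with one case per subset, this variance estimate coincides with the familiar s²/n, which gives a quick check that the (K-1)/K factor is applied correctly.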

²The term "Jackknife" was suggested by John Tukey, a leading statistician, who noted that the method
might be used to estimate variances of complex statistics. He noted that the use of Jackknife
replicates provides a simple and approximate method for making variance estimates from samples
even for complex estimators such as the double sampling regression estimator. He observed that the
procedure was a simple but often effective tool, something like using a jackknife as a general-purpose
tool.






Another way to form Jackknife replicates starts by defining 2K subsets 
of the state sample and arranging them into K pairs. The pairs would be random 
divisions of first-stage sampling units, within strata if the original sample is 
stratified, or stratified samples within groups of strata of about equal aggregate size. 
A Jackknife replicate then uses the data in all pairs except one. In that pair, one of 
the subsets chosen randomly is doubled and the other is omitted. This gives K 
replicates. Again, the regression estimate is made for each of the replicates. The 
estimate of the variance is then given by 

$$
s_{\hat{R}}^{2} = \sum_{k=1}^{K} \left(\hat{R}_k - \hat{R}\right)^{2}
$$

where $\hat{R}_k$ is the estimated overpayment error rate for the k-th replicate.

With either of the above approaches, confidence bounds can be
computed as $\hat{R} \pm t\, s_{\hat{R}}$. With 30 or more replicates, the ordinarily used
values of t are t=1.96 for a 95 percent confidence interval and t=1.645 for a 90 percent
confidence interval. (If the samples were drawn from normal distributions, these would be
appropriate values for t.)

However, in order to reduce the effect of skewness in the distribution
of estimated payment error rates, we describe a modification of the above procedure.
The modification is to transform the overpayment error rates for each of the K
Jackknife replicates and for the total sample by a logarithmic transformation. Such a
transformation reduces the skewness of the distribution. If we denote

$$
z_k = \log \hat{R}_k , \qquad z = \log \hat{R} ,
$$

then

$$
s_z^{2} = \frac{K-1}{K} \sum_{k=1}^{K} \left(z_k - z\right)^{2}
$$

if the first described method of forming replicates is used, and

$$
s_z^{2} = \sum_{k=1}^{K} \left(z_k - z\right)^{2}
$$

if the second method is used.



The lower and upper 95 percent confidence bounds for z are
$z_L = z - 1.96\, s_z$ and $z_U = z + 1.96\, s_z$.

The lower and upper confidence bounds for R are then $R_L = \operatorname{antilog} z_L$
and $R_U = \operatorname{antilog} z_U$.
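
The logarithmic (Jackknife-L) computation can be sketched in a few lines. The sketch assumes the replicate estimates have already been computed by some other routine, uses natural logarithms (the base does not affect the resulting bounds), and uses hypothetical names throughout; the `paired` flag is our own device for switching between the two replicate-forming methods described above.

```python
import math

def jackknife_log_bounds(r_full, r_reps, t=1.96, paired=False):
    """Confidence bounds for R from log-transformed Jackknife replicate estimates.

    r_full -- estimate from the whole sample (must be positive)
    r_reps -- list of the K replicate estimates (each positive)
    t      -- multiplier, e.g. 1.96 for a nominal 95 percent interval
    paired -- False: drop-one-subset replicates, factor (K-1)/K;
              True: paired-subset replicates, factor 1
    """
    k = len(r_reps)
    z = math.log(r_full)
    sum_sq = sum((math.log(r) - z) ** 2 for r in r_reps)
    factor = 1.0 if paired else (k - 1) / k
    s_z = math.sqrt(factor * sum_sq)
    # Transform the bounds for z back to the original scale.
    return math.exp(z - t * s_z), math.exp(z + t * s_z)

# Illustrative (made-up) replicate estimates around a full-sample rate of .060
lower, upper = jackknife_log_bounds(0.060, [0.059, 0.061, 0.060, 0.062, 0.058])
print(lower, upper)
```

Note that the resulting interval is symmetric on the log scale, so on the original scale the upper bound lies farther from the estimate than the lower bound does. This is the feature that counteracts the right skewness of the estimated error rates.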

We have made some tests of this procedure for computing confidence 
bounds, using 400 repeated independent samples from Population A, for each of 
four sample sizes used earlier, and for an additional 1500 independent replicates for 
the largest sample size (n=2400, n'=360) and for an additional 2000 replicates for the 
smallest sample size (n=350, n'=160). The results are summarized in Table 2-6. (See 
also Appendix C.) 

Table 2-6. Proportion of samples in which the true error rate is above, below, or covered by specified
nominal confidence intervals, based on logarithmic transformation of Jackknife replicate
estimates, Population A

                                                Sample size, n/n'
                                                                          All sample
                                     2400/     1200/     880/     350/    sizes
                                     360       360       260      160     combined

Number of independent replicates     1900      400       400      2400    5100

Proportion of samples:
  Below .025 point                   .017      .032      .028     .023    .022
  Below .05 point                    .035      .048      .068     .049    .045
  Between .05 and .95 points         .890      .890      .867     .889    .888
  Above .95 point                    .075      .062      .065     .062    .067
  Above .975 point                   .045      .035      .040     .031    .035






These proportions are considerably closer to the nominal percentages
than those observed in Table 2-4 for the confidence intervals as currently computed.
The proportions below the lower 2-1/2 and 5 percent confidence bounds are
reasonably close, although they still average somewhat less than the nominal
2-1/2 percent and 5 percent; those above the upper bounds are moderately greater than
the nominal 2-1/2 percent and 5 percent. However, the differences, although
statistically significant, are small enough to be of relatively minor concern. These
results are very encouraging, although some further empirical work is desirable on
transformations other than the logarithmic transformation, which may reduce the
skewness further. Additional details appear in Chapter 3 and in Appendix C.



2.5 Some Further Considerations for Estimating Sampling Error 

Current practice in AFDC-QC is to estimate sampling errors (standard
errors) of estimated overpayment error rates for a state using only the sample data
for the current evaluation period for that state. This is consistent with general
practice. However, as indicated earlier, such estimates of sampling errors are
themselves subject to large sampling errors, very much larger for a given sample size
than in many common sampling situations. As illustrations, Table 2-7 shows estimates
of the coefficients of variation of the estimated sampling errors made by current
procedures from samples of various sizes drawn from Test Populations A, B, and C.
Each coefficient of variation is estimated from 1000 samples drawn independently
for each sample size and test population.

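The estimated coefficient of variation of $s_{\hat{R}}^{2}$ is

$$
\frac{\left[ \sum_{i=1}^{1000} \left( s_{\hat{R},i}^{2} - \overline{s_{\hat{R}}^{2}} \right)^{2} / 1000 \right]^{1/2}}{\overline{s_{\hat{R}}^{2}}}
$$

with $\overline{s_{\hat{R}}^{2}} = \sum_{i=1}^{1000} s_{\hat{R},i}^{2} / 1000$ and i indicating the i-th replicate.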






Table 2-7. Approximate coefficients of variation of $s_{\hat{R}}$ and $s_{\hat{R}}^{2}$ from 1000 samples drawn from Test
Populations A, B, and C for alternate sample sizes,* compared with samples drawn from
normal distributions

                                                   Sample sizes

                                     n=2400     n=1200     n=880      n=350
                                     n'=360     n'=360     n'=260     n'=160

CV($s_{\hat{R}}$)
  Population A                       .18        .14        .18        .20
  Population B                       .20        .16        .18        .24
  Population C                       .27        .22        .26        .30

CV($s_{\hat{R}}^{2}$)
  Population A                       .34        .29        .36        .40
  Population B                       .40        .32        .37        .46
  Population C                       .55        .46        .54        .63

For a mean of a simple random sample
of n' drawn from a normal distribution
  CV($s_{\bar{x}}$)                  .037       .037       .044       .056
  CV($s_{\bar{x}}^{2}$)              .075       .075       .088       .112

*The 1000 samples for each sample size from each test population were drawn independently (a simple random
sample of n drawn from the test population, and a simple random subsample of n' from the sample of n). The
coefficients of variation of $s_{\hat{R}}$ and $s_{\hat{R}}^{2}$ for a given population and sample size are computed from the same 1000
samples.






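Similarly, the estimated coefficient of variation of $s_{\hat{R}}$ is

$$
\frac{\left[ \sum_{i=1}^{1000} \left( s_{\hat{R},i} - \bar{s}_{\hat{R}} \right)^{2} / 1000 \right]^{1/2}}{\bar{s}_{\hat{R}}}
$$

with $\bar{s}_{\hat{R}} = \sum_{i=1}^{1000} s_{\hat{R},i} / 1000$.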



The exceedingly skewed distributions of overpayment errors in
combination with the use of double sampling and the regression estimator result in
these very large sampling errors of estimated variances and standard errors as
compared with, for example, the sampling errors of estimates of the variance and
standard errors of means based on simple random samples of size n' drawn from a
normal distribution³ (which are also shown in Table 2-7). The large coefficients of
variation of the estimated variances and standard errors not only result in relatively
large sampling errors for the estimated overpayment error rates, but also cause
differences between exact confidence limits (limits that would conform exactly to the
nominal probabilities) and the confidence limits as currently computed. As seen
earlier (Table 2-4), for the confidence limits as currently computed, the observed
coverage probabilities in repeated samples from the test populations differ
somewhat from the nominal 95 and 90 percent probabilities, and differ more widely
for the upper and lower tails of the confidence intervals considered separately.



³See Hansen, M., Hurwitz, W., and Madow, W., Sample Survey Methods and Theory, Vol. 1, (John
Wiley & Sons, New York, 1953), pp. 133-148, where theory is given, with illustrations for simple
random sampling. The theory and illustrations given there do not cover double sampling with
regression estimation, for which the impact of skewed distributions is increased. We note, also, that
technically it is not the skewness of a distribution but, rather, its high kurtosis which causes the very
large variance of estimated variances. The kurtosis is measured by $\beta$ = (fourth moment about
mean)/$\sigma^{4}$. However, in practice, highly skewed distributions tend to have high kurtosis, and the
greater the skewness, the greater the kurtosis. This is strikingly demonstrated in the illustrations in
the reference cited. Consequently, we prefer to refer to high skewness in characterizing such
distributions, which is readily seen by the eye, rather than high kurtosis, which is not.
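
The dependence on kurtosis can be made concrete for the simple random sampling case. The sketch below uses the standard result for independent draws, Var(s²) = σ⁴[β - (n-3)/(n-1)]/n, together with the first-order approximation that the coefficient of variation of s is about half that of s². It ignores any finite population correction and does not cover double sampling with regression estimation; the function names are ours.

```python
import math

def cv_of_variance(n, kurtosis=3.0):
    """CV of the sample variance s^2 for a simple random sample of size n
    from a population with the given kurtosis (3.0 for a normal distribution)."""
    return math.sqrt((kurtosis - (n - 3) / (n - 1)) / n)

def cv_of_std_error(n, kurtosis=3.0):
    """First-order approximation: CV(s) is about half of CV(s^2)."""
    return cv_of_variance(n, kurtosis) / 2

# Normal-distribution values for the Federal subsample sizes used in the report
for n_sub in (360, 260, 160):
    print(n_sub, round(cv_of_std_error(n_sub), 3), round(cv_of_variance(n_sub), 3))
```

With kurtosis 3 this gives .037 and .075 for n'=360, .044 and .088 for n'=260, and .056 and .112 for n'=160, which agree with the normal-distribution rows of Table 2-7. Passing a larger kurtosis value shows how quickly the coefficient of variation grows for long-tailed populations.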






A particularly serious problem that results from the large coefficients of
variation of $s_{\hat{R}}^{2}$ is that estimates of the sample size needed to achieve a given level of
precision for a state can be subject to wide-ranging error. For example, in a state in
which the joint distribution of state and Federal determinations of overpayment
error rates corresponds approximately to Test Population C, and with a state sample
size of 350 and a Federal subsample size of 160, the coefficient of variation of the
estimated variance, $s_{\hat{R}}^{2}$, would be about 63 percent (and of $s_{\hat{R}}$ about 30 percent).

We examine what might result when the estimated variance for a state
is subject to such a large coefficient of variation and is used to determine the sample
size needed to achieve a given level of precision. Suppose that an estimate is made
for a state of the sample size needed to achieve an estimate of R subject to a standard
error of .015. For illustration, we assume that the distribution of overpayment
errors in the state is like that of Population C. From the known characteristics of
Population C, we compute that if we retain the ratio of sample sizes n'/n = 160/350,
a state sample size of n=420 and a Federal subsample size of n'=192 would yield such
a standard error. However, if one estimated the sample size needed on the basis of
$s_{\hat{R}}^{2}$ estimated from a sample of n'=160 and n=350 (approximately the average annual
sample size in use in a number of the smaller states) and if the ratio of
n'/n = 160/350 were retained, one would have roughly 1 chance in 20 that the
estimates of the Federal and state sample sizes needed would be either as low as
n'=38 and n=83 or lower or as high as n'=508 and n=1111 or higher. Such a range is
far too wide to provide a useful guide for determining needed sample sizes.
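
The sample size arithmetic in this illustration can be reproduced directly. The sketch below uses the relation given in the accompanying footnote, n' = S²/σ², with the unit variance .043 reported there for Population C and a target standard error of .015; rounding up to whole cases and the function and argument names are our assumptions.

```python
import math

def needed_sample_sizes(unit_variance, target_se, n_sub, n_state):
    """Federal subsample size n' and state sample size n needed for a target
    standard error, holding the ratio n'/n fixed at n_sub / n_state.

    unit_variance -- unit variance S^2 (here already evaluated at that ratio)
    target_se     -- desired standard error of the estimated error rate
    """
    n_prime = math.ceil(unit_variance / target_se ** 2)  # n' = S^2 / sigma^2
    n = math.ceil(n_prime * n_state / n_sub)              # keep n'/n fixed
    return n_prime, n

print(needed_sample_sizes(0.043, 0.015, 160, 350))
```

This gives n'=192 and n=420, the values cited in the text. Because n' is proportional to the unit variance, an estimated variance that is too low or too high by some factor changes the indicated sample size by the same factor, which is why a coefficient of variation of about 63 percent produces such a wide range.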

Even for states with large QC sample sizes, the range would be wide.
For example, for samples of n'=360 and n=2400 drawn from a state distribution like
that of Test Population C, if this ratio of n' to n is retained, there is about 1 chance in
20 that the estimates of needed sample sizes would be as low as n'=38 and n=255 or
lower, or as high as n'=305 and n=2036 or higher.⁴ Of course, the ratio n'/n might
not be retained for such different sample sizes, but the effect of the wide ranging
sampling variability would remain. We note that the variance of the estimated
variance is somewhat larger for Test Population C, which we have used for
illustration, than for the other two test populations.

⁴The needed sample sizes were computed as follows: $n' = S_R^{2}/\sigma_{\hat{R}}^{2}$, with $\sigma_{\hat{R}}$ set equal to .015, and
$S_R^{2} = (S_x^{2}/\bar{T}^{2})\,[1 - \rho^{2}(1 - n'/n)] = .043$ computed for Population C (see Appendix A) and assuming a fixed



2.5.1 Pooled Variance Estimates

To reduce the wide sampling variability of the estimated variance of 
the estimate of R, some consideration has been given by AFDC staff to the use of a 
pooled estimate of variance in computing the estimated standard error. We regard 
this as a useful procedure and have developed and evaluated an approach to 
accomplish this. 

We have explored some alternatives that are described in Appendix E. 
A pooled variance estimation procedure that appears to provide acceptable variance 
estimates is one in which the states are first ordered on the basis of preliminary 
pooled unit variance estimates for a prior year or years. We define the preliminary 
estimated unit variance for state k for this purpose as 

tk 

where the symbols are as defined in Chapter 1, with the subscript k added to identify 
state k. 



ratio for n'/n = 160/350. In practice $S_R^{2}$ is unknown and must be estimated from the sample. The
estimate of $S_R^{2}$ is $n' s_{\hat{R}}^{2}$, as given by Equation (3) in Chapter 1. The observed (not the nominal
bounds assuming a normal distribution) 2-1/2 percent and 97-1/2 percent confidence bounds of $s_{\hat{R}}^{2}$ in
1000 independent replicate samples of n'=160 and n=350, drawn from Population C, and also for
n'=360 and n=2400 were used to obtain these results.






For this purpose, a uniform value of f=.2 is used for each of the
51 states. A simple mean of such estimated unit variances for the state for two prior
years is then computed. The list of states ordered on these average preliminary unit
variances is then divided into several relatively homogeneous groups (in
Appendix E, we have used 5 groups with 10 or 11 states in each group). For the
preliminary unit variance estimates, no use is made of the variance estimates or
other sample data for the current year.

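The pooled estimates $\tilde{s}_{xk}^{2}$, $\tilde{t}_k$, and $\tilde{r}_k$ of $S_{xk}^{2}$, $\bar{T}_k$, and $\rho_k$, respectively, are
made for state k in a group of m states as follows (with each sum taken over the m-1
states i different from state k):

$$
\tilde{s}_{xk}^{2} = \frac{2 n'_k s_{xk}^{2} + \sum_{i \ne k} n'_i s_{xi}^{2}}{2 n'_k + \sum_{i \ne k} n'_i}
$$

$$
\tilde{t}_k = \frac{2 n'_k \bar{t}_k + \sum_{i \ne k} n'_i \bar{t}_i}{2 n'_k + \sum_{i \ne k} n'_i}
$$

$$
\tilde{r}_k = \frac{2 n'_k s_{xyk} + \sum_{i \ne k} n'_i s_{xyi}}{\left( 2 n'_k + \sum_{i \ne k} n'_i \right) \tilde{s}_{xk}\, \tilde{s}_{yk}}
$$

and $\tilde{s}_{yk}^{2}$ is defined the same as $\tilde{s}_{xk}^{2}$, but for the Y variable. Here

$$
s_{xi}^{2} = \sum_{j=1}^{n'_i} (x_{ij} - \bar{x}_i)^{2} / (n'_i - 1)
$$

$$
s_{xyi} = \sum_{j=1}^{n'_i} (x_{ij} - \bar{x}_i)(y_{ij} - \bar{y}_i) / (n'_i - 1)
$$

$$
\bar{t}_i = \sum_{j=1}^{n'_i} t_{ij} / n'_i
$$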




The symbol $x_{ij}$ denotes the Federal determination of the overpayment error for the
j-th case in the Federal subsample in state i, $y_{ij}$ the corresponding state
determination of overpayment error, $t_{ij}$ the total payment to case j in state i, and $n'_i$
the size of the Federal subsample for the year in state i.

Note that each of the above pooled estimates is a simple weighted 
average of the respective state values, with weights equal to the Federal subsample 
sizes, except that state k, the state for which the pooled unit estimate is being made, 
is given double weight. 

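The pooled unit variance estimate for state k is then

$$
\tilde{s}_k^{2} = \left( \tilde{s}_{xk}^{2} / \tilde{t}_k^{2} \right) \left\{ 1 - \tilde{r}_k^{2} (1 - f_k) \right\}
$$

where $f_k = n'_k / n_k$ is the fraction that the Federal subsample is of the state sample in
state k.

The pooled estimate of the variance of $\hat{R}_k$ is then

$$
\tilde{s}_{\hat{R}k}^{2} = \tilde{s}_k^{2} / n'_k
$$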
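
A sketch of the pooling computation may help fix ideas. It assumes our reading of the procedure: weights equal to the Federal subsample sizes, double weight for the state of interest, a pooled correlation formed from the pooled covariance and pooled variances, and a unit variance of the form (s_x²/t²){1 - r²(1 - f)} divided by n' to give the variance of the estimated rate. The dictionary keys and function name are hypothetical, and the input figures are made up.

```python
import math

def pooled_unit_variance(states, k, n_state_k):
    """Pooled unit variance and pooled variance of the estimated rate for state k.

    states    -- list of dicts, one per state in the group, with keys
                 'n_sub' (Federal subsample size), 's2x', 's2y' (unit variances
                 of Federal and state determinations), 'sxy' (their covariance),
                 and 'tbar' (mean payment per case); key names are hypothetical
    k         -- index of the state for which the pooled estimate is made
    n_state_k -- state sample size n for state k
    """
    # Weights are Federal subsample sizes, with state k given double weight.
    weights = [s["n_sub"] * (2 if i == k else 1) for i, s in enumerate(states)]
    total = sum(weights)

    def weighted_average(key):
        return sum(w * s[key] for w, s in zip(weights, states)) / total

    s2x = weighted_average("s2x")
    s2y = weighted_average("s2y")
    sxy = weighted_average("sxy")
    tbar = weighted_average("tbar")
    r = sxy / math.sqrt(s2x * s2y)            # pooled correlation
    f = states[k]["n_sub"] / n_state_k        # subsampling fraction for state k
    unit = (s2x / tbar ** 2) * (1 - r ** 2 * (1 - f))
    return unit, unit / states[k]["n_sub"]

group = [
    {"n_sub": 100, "s2x": 100.0, "s2y": 100.0, "sxy": 80.0, "tbar": 10.0},
    {"n_sub": 100, "s2x": 100.0, "s2y": 100.0, "sxy": 80.0, "tbar": 10.0},
]
print(pooled_unit_variance(group, k=0, n_state_k=500))
```

As the text notes, the inputs are quantities already computed for the direct variance estimates, so the pooling step adds only a few weighted averages per state.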

This pooled estimate will considerably improve the unit variance
estimate for state k, provided that the true and unknown unit variance in each of
the other states in the group is not too different from $S_k^{2}$, the true (unknown) unit
variance for state k. The improvement results because the pooled estimates are
made from a much larger sample of cases (about 8 to 14 times as large for an average
state) than is $s_k^{2}$. Of course, the pooled estimate is, in fact, a biased estimate of $S_k^{2}$, the
bias depending on how much the expected values of the true state variances and
correlations differ from state to state in the group. The analyses and evaluations in
Appendix E indicate that very substantial gains result from the use of such a pooled
variance estimate for purposes of providing a general measure of precision for a
state. We show in Section 2.5.2 that the pooled variance estimate is not appropriate
for use in computing lower confidence bounds, but that the direct state variance
estimates are.

We note that this particular pooled unit variance estimator involves 
very little computational burden. It simply makes use of unit variances and 
covariances (or correlations) already estimated for purposes of computing direct 
variance estimates for each state. 

It is shown in Appendix E that the simple pooled variance estimates
evaluated there have moderately higher correlations across states with the true state
variances being estimated than do the direct variance estimates, state by state. At the
same time, they have very much smaller variances, by factors of about 6 to 14.

The simple pooled variance described here differs from the one 
described and evaluated in Appendix E because the one described here obtains 
weighted averages in which the weight for the specified state is doubled in 
computing the various terms. From the analyses in Appendix E, we tentatively 
conclude that this presumably will result in a small increase in the correlation with 
the true values being estimated, and a small increase in the variance of the 
composite estimate. The differences should be modest, but some evaluation of this 
presumption would be desirable. 

In summary, because of its much smaller variances, and its moderately
higher correlation with the true values being estimated as compared to the direct
variance estimates, we conclude that the pooled variance estimator has substantial
advantage in providing general precision measures, and in arriving at the expected
precision of specified sample sizes. However, it is less useful for computing a lower
confidence bound than the direct variance estimate for a state.






2.5.2 Implications for the Choice of Variance Estimators 

The results just presented, indicating substantial gains from the use of 
a pooled variance estimator for a state, might appear to lead to the conclusion that 
the pooled variance estimator would be superior for all purposes. However, this 
may not be the case. While the pooled variance estimator achieves substantial gains 
for most purposes, there remain applications where direct variance estimation, 
state-by-state, has advantages. We summarize some relevant results in Table 2-8. 

The results presented in Table 2-8 are for four different methods of
computing confidence intervals. For the "Regular" variance estimator, the
confidence bounds are obtained by computing $\hat{R} \pm t\, s_{\hat{R}}$, where $s_{\hat{R}}$ is the usual direct
estimate of the standard error of $\hat{R}$ from the sample for the current year. For the
"Jackknife-L", the variance is computed from logarithms of Jackknife replicate
estimates, and the confidence bounds are obtained from the inverse transformation
of logarithmic confidence bounds, as discussed in Section 2.4 and in Appendix C.
For the "Known $\sigma_{\hat{R}}^{2}$" variance estimator, the variance is not estimated from the
sample. Instead, the confidence interval is computed as $\hat{R} \pm t\, \sigma_{\hat{R}}$, where the
parameters of Population A are used in computing $\sigma_{\hat{R}}^{2}$ (where $\sigma_{\hat{R}}^{2} = S_R^{2}/n'$ and $S_R^{2}$ is
given in Footnote 4 in Section 2.5). Of course, the parameters for computing $\sigma_{\hat{R}}^{2}$ are
known for our test population, but would not be known in practice. The results for
the true variance, which is unknown in practice, are presented to help evaluate the
pooled variance estimator. For the pooled variance estimator, the confidence bounds
are computed as for the "Regular," except that the pooled estimate of the variance of
$\hat{R}$ is used, obtained by procedures discussed in Section 2.5.1, and evaluated in
Appendix E.

Table 2-8 shows, in the fourth, fifth, and sixth columns, the estimated
mean, standard error, and coefficient of variation (CV) of the lengths of each type of
confidence interval. The next two columns show the estimated probability that the
true population overpayment error rate is, respectively, to the left and to the right of
the computed confidence intervals. The last three columns show the estimated
mean, standard error, and coefficient of variation of the lower bounds of the
confidence intervals.






Table 2-8. Properties of alternative procedures for computing confidence intervals for R, for Population A (see text for description)

                                              Length of interval         Estimated probability         Lower bound
Sample     Variance        Confidence                Standard                                                   Standard
size       estimator       level            Mean     error     C.V.      R < l.b.    R > u.b.        Mean      error      C.V.

2400/360   Regular         90%              0.0268   0.0053    .20       0.023       0.090           0.0596    0.00653    .11
                           95%              0.0319   0.0064    .20       0.009       0.068           0.0571    0.00634    .11

           Jackknife-L     90%              0.0270   0.0054    .20       0.031       0.075           0.0608    0.00660    .11
                           95%              0.0322   0.0065    .20       0.017       0.048           0.0587    0.00641    .11

           Known variance  90%              0.0267   0.0000    .00       0.055       0.039           0.0597    0.00798    .13
                           95%              0.0318   0.0000    .00       0.027       0.020           0.0571    0.00798    .14

           "Pooled"        90%              0.0267   0.0022    .08       NA          NA              0.0597    0.00790    .13
                           95%              0.0318   0.0026    .08       NA          NA              0.0571    0.00780    .14

350/160    Regular         90%              0.0499   0.0105    .21       0.021       0.091           0.0480    0.01153    .24
                           95%              0.0595   0.0126    .21       0.006       0.065           0.0432    0.01106    .26

           Jackknife-L     90%              0.0511   0.0111    .22       0.042       0.061           0.0518    0.01169    .22
                           95%              0.0614   0.0134    .22       0.019       0.040           0.0486    0.01021    .21

           Known variance  90%              0.0491   0.0000    .00       0.055       0.042           0.0484    0.01488    .31
                           95%              0.0584   0.0000    .00       0.028       0.018           0.0437    0.01488    .34

           "Pooled"        90%              0.0491   0.0042    .09       NA          NA              0.0484    0.01460    .30
                           95%              0.0584   0.0051    .09       NA          NA              0.0437    0.01430    .29




The first six rows for each sample size in Table 2-8 were obtained by
drawing 1000 independent samples from Test Population A. The same 1000
replicate samples were used for computing results for the Regular, Jackknife, and
known $\sigma_{\hat{R}}^{2}$ estimators, for sample size n=2400, n'=360, and another independent set
of 1000 replicate samples was used to obtain the corresponding measures for sample
size n=350, n'=160.

In the last two rows for each sample size labeled "Pooled", we provide 
approximate estimates of what would have been obtained had we been able to 
simulate a pooled variance estimation procedure for a set of states similar to 
Population A. These results were obtained as explained in Section 2.5.3. 

We now examine the implications of the alternative variance 
estimators for various uses. 

For computing confidence bounds after the sample results are
available, it appears from Table 2-8, and from Appendix C (as we explain below), that
Jackknife-L (i.e., the logarithmic transformation of Jackknife replicate estimates) has
advantages over the other alternatives considered, even though the estimated
standard error of the length of the confidence interval is about two and a half times
greater for this alternative than for the "pooled" variance estimator. Also, the
standard error of the lower confidence bound is slightly larger for the Jackknife-L
than for the Regular. However, the standard error of the lower confidence bound
based on the "pooled" variance estimate is about 20 to 40 percent larger than for
lower bounds based on the Regular or Jackknife-L variance estimators. The low
standard error of the lower confidence bounds based on both the Regular and
Jackknife-L variance estimators arises because of the relatively high correlation of R
and its estimated standard error (see Appendix C for fuller discussion).

For the "pooled" estimator, the probabilities associated with the tails,
that is, beyond the ends of the confidence intervals, are not available. However, the
tails for the "known $\sigma_{\hat{R}}^{2}$" confidence intervals, which use the population parameters
instead of sample estimates of $\sigma_{\hat{R}}$, give estimates of those probabilities that are quite
good for the tails. Consequently, because the variances of the estimated standard
error for the "pooled" are much smaller than for the "Regular," we assume the tails
for the "pooled" might be reasonably close to those for which the known $\sigma_{\hat{R}}$ is used.

We conclude that, in spite of the apparent advantages of the pooled
variance estimator for most purposes, the substantially smaller standard error of the
lower bound obtained from either the regular procedure or Jackknife-L appears to be
sufficiently important as to lead to the choice of one of these procedures for
computing the lower bound. Another reason for adopting one of these procedures
in computing a lower confidence bound is that each depends only on the estimates
from the sample for the current year. One does not have to justify bringing in other
data that might be challenged as not completely relevant. The Jackknife-L is
preferable to the Regular because the frequencies in the "tails" are considerably
closer to the nominal probabilities than are those for the Regular. In summary, we
conclude that the Jackknife logarithmic procedure is preferable for computing lower
confidence bounds that are to be used for such purposes as the determination of
disallowances if they are to be based on lower confidence bounds. In Section 2.4 and
Appendix C, we show that it also yields reasonably good results for the upper
confidence bounds. The "regular" or current procedure for computing lower
confidence bounds may provide acceptable results for less rigorous uses.⁵

The situation is entirely different with regard to estimates of sampling
errors for other purposes. In Section 2.5 (just before Section 2.5.1), we showed great
variability of the "Regular" procedure in making estimates of the sample size
needed to achieve a given level of sampling error. The range of variability in
estimating needed sample sizes will be roughly one-sixth as much or less for the
pooled variance estimator as for the direct or for the logarithmic transformation of
the Jackknife variance estimator. Similarly, the variability of advance estimates of
expected sampling errors based on results for prior years will be greatly reduced with
the pooled variance estimator. These advantages are very substantial. Indeed, it
appears essential to use a pooled or composite variance estimator in advance variance
estimation and in planning needed sample sizes.

⁵You have asked for an estimate of the added cost of computing lower confidence bounds by the
Jackknife-L procedure as compared with the regular procedure. This cost depends on the computer
equipment available and on how the job is programmed. A very rough generous estimate based on the
computing equipment we have used for creating the Jackknife replicates and for computing the
variances and confidence limits for the test populations is no more than $4,000 for the programming,
which is a one-time cost for all states and years, and not more than about $200 for computer time for
each state computation.



Our conclusion is that both approaches have important, but different, uses. 



2.5.3 Note on Computation of Characteristics of Confidence Intervals Using the Pooled Variance Estimator 

The results presented in Table 2-8 for the "Pooled" variance estimator 
came only in part from the simulations and were estimated as follows. 

The lengths of the confidence intervals for the pooled estimator, $\ell$, 
were assumed to be approximately equal to those for "known $\sigma^2_{\hat{R}}$" since the mean of 
the pooled estimates of the standard error of $\hat{R}$ should be close to the known $\sigma_{\hat{R}}$. 
The $\sigma^2_{s_{\hat{R}}}$ for the pooled estimate was assumed to be equal to one-sixth of the 
$\sigma^2_{s_{\hat{R}}}$ for the regular estimator. This is greater than the average value of the ratios of 
variance of the pooled estimator (with assumed zero bias) to the variance of the regular 
estimator observed in Appendix E. The mean of the lower bounds, $\ell_b$, for the 
pooled estimator was assumed equal to the $\ell_b$ for known $\sigma^2_{\hat{R}}$ since the intervals 
would be of approximately equal average length. The estimated standard error of 
the pooled lower bound, $\sigma_{\ell_b}$, follows from the fact that the computed lower 
confidence bound for the pooled estimator is $\ell_b = \hat{R} - t s_{\hat{R}}$. Consequently, the 
variance of $\ell_b$ is 

$$\sigma^2_{\ell_b} = \mathrm{Var}(\hat{R}) + t^2\,\mathrm{Var}(s_{\hat{R}}) - 2t\,\rho_{\hat{R},s_{\hat{R}}}\sqrt{\mathrm{Var}(\hat{R})\,\mathrm{Var}(s_{\hat{R}})}.$$

The $\rho_{\hat{R},s_{\hat{R}}}$ is the correlation of $\hat{R}$ and $s_{\hat{R}}$ and was assumed to be equal to $\sqrt{1/10}$. 

This is a rough approximation based on the correlation of x and x+y, 
where y is the sum of a variable y for a simple random sample of n from a specified 
population, and x is the sum of a variable x for an independent simple random 
sample of m, where m/(m+n) equals approximately 1/10. The value 1/10 is chosen 
because the sample for a particular state in a group may constitute roughly one-tenth 
of the sample for the entire group. Fortunately, for the approximate 
relationships that should hold in this case, the $\sigma_{\ell_b}$ is not sensitive to any of the 
terms but the first one, so that the approximations for $\sigma_{\ell_b}$ should be reasonably good. 
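The variance expression for the lower bound, and the correlation of x with x+y that motivates the assumed value of $\sqrt{1/10}$, can be sketched as follows. This is a minimal illustration only: the function names, the uniform variable used in the simulation, and the numerical inputs are hypothetical and are not taken from the report.

```python
import math
import random


def lower_bound_variance(var_r, var_s, t=1.645, rho=math.sqrt(1 / 10)):
    """Variance of the lower bound R_hat - t * s, given Var(R_hat), Var(s),
    and the correlation rho between R_hat and s."""
    return var_r + t ** 2 * var_s - 2 * t * rho * math.sqrt(var_r * var_s)


def simulated_corr(m, n, reps=20000, seed=1):
    """Monte Carlo correlation between x and x + y, where x and y are sums
    over independent samples of sizes m and n (uniform draws, an assumption)."""
    rng = random.Random(seed)
    xs, totals = [], []
    for _ in range(reps):
        x = sum(rng.random() for _ in range(m))
        y = sum(rng.random() for _ in range(n))
        xs.append(x)
        totals.append(x + y)
    mean_x = sum(xs) / reps
    mean_t = sum(totals) / reps
    cov = sum((a - mean_x) * (b - mean_t) for a, b in zip(xs, totals))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_t = sum((b - mean_t) ** 2 for b in totals)
    return cov / math.sqrt(var_x * var_t)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the first term dominating.
    var_r = 0.006 ** 2
    var_s = 0.0005 ** 2
    print(math.sqrt(var_r), math.sqrt(lower_bound_variance(var_r, var_s)))
    # With m / (m + n) = 1/10 the correlation should be near sqrt(1/10).
    print(simulated_corr(m=1, n=9), math.sqrt(1 / 10))
```

When Var(s) is small relative to Var(R_hat), as in the hypothetical inputs above, the result stays close to the first term, which is the insensitivity the text describes.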



2.6 Conclusions on the Validity of the Regression Methodology 

From the above analyses, supplemented by the fuller analyses 
presented in later sections and in the appendices, we conclude: 

• The regression methodology provides unbiased or at most 
trivially biased point estimates of the overpayment error rates 
for the AFDC-QC samples in use. 

• The sampling errors estimated from the samples also provide 
nearly unbiased estimates of the sampling errors of the 
estimated overpayment error rates. However, they are subject to 
large sampling errors, much too large to be useful for 
determining needed sample sizes to yield specified magnitudes 
of sampling errors. 

• A pooled variance estimation procedure is provided that greatly 
improves estimates of variances, and thus of estimates of needed 
sample sizes to achieve specified precision levels. 

• The confidence intervals as now being computed yield results 
that, although imperfect, nevertheless provide useful guides to 
the precision of the point estimates of the overpayment error 
rates. 

• A modified methodology is provided that will yield improved 
confidence intervals, with closer agreement to the nominal 
coverage probabilities, especially in the coverage of the tails. 



• The point estimates are not affected by imperfections in the 
confidence intervals as now computed. They provide estimates 
of the overpayment error rates that are valid within the ranges 
of error indicated approximately by the computed confidence 
limits. 






CHAPTER 3. CONSIDERATIONS IN CHOICE OF LOWER CONFIDENCE BOUND 
VERSUS POINT ESTIMATE IN DETERMINING DISALLOWANCES 



3.1 Introduction 

In this chapter we examine various aspects of the second question we 
were asked to consider (see Section 1.1), as follows: 

• What are the considerations and constraints involved 
in the choice of a lower confidence bound versus a 
point estimate in determining disallowances? 

Disallowances are currently computed and assessed annually for states 
with estimated overpayment error rates in excess of allowed tolerances. As 
explained in Chapter 1, the allowed tolerances are specified in legislation. They vary 
from state to state for years prior to 1984, and are set at 3 percent for 1984 and 
thereafter. The disallowance for a state is $D = (\hat{R} - R_0)A$, provided $\hat{R}$ is greater than 
$R_0$, where $\hat{R}$ is the QC regression estimate of $R$ (the true overpayment error rate for 
the year), $R_0$ is the corresponding tolerance or target rate (the terms "tolerance" and 
"target rate" are used interchangeably), and $A$ is the amount of the Federal payment 
to the state for the year. Under certain circumstances, the disallowance can be 
suspended or waived by the Secretary of Health and Human Services. 

The assessment of disallowances has led to challenges and suits by 
some of the states, and some have proposed that, because the estimated error rates 
are subject to sampling errors, a lower confidence bound of $R$ should be substituted 
for $\hat{R}$ in computing the disallowance. This alternative has also been considered by 
the Congress. Consequently, it is appropriate to examine and compare the statistical 
implications of these and other alternatives. 

There are important precedents for the use of either the point estimate 
or a lower confidence bound in various applications of sampling. The choice 
should be guided by the purposes to be accomplished by the assessment of 
disallowances and is primarily a policy decision, rather than a statistical one, and 
depends on the goals to be served, as discussed in points S) through (13) in 
Section 1.2. However, it has important statistical implications that we will examine 
in this chapter. We note again, here, that in practice the point estimate is ordinarily 
and appropriately used where two parties to a funds transfer or payment are 
involved, and the amount of the payment is determined by a sample estimate. 
Such applications of samples and the point estimate generally call for samples large 
enough to yield reasonably precise estimates. Use of a lower confidence bound 
would result in a disadvantage to one party to the advantage of the other. A lower 
confidence bound is more likely to be appropriate if the purpose of a sample 
estimate is to prove carelessness or fraud, such as in auditing, and the consequence may 
be an assessment of a penalty. In AFDC, the Tax Equity and Fiscal Responsibility Act 
(TEFRA) of 1982 has been interpreted as requiring use of the point estimate. 

When samples are large enough, the difference between the two 
approaches is reduced, and ultimately, for large enough samples, the difference 
becomes relatively small. However, the differences are relatively large for the sizes 
of annual AFDC samples in use. Since large transfers of funds are involved, an 
understanding of the statistical implications of the alternatives is desirable. We 
consider this in Section 3.2. We refer to the use of the point estimate in computing 
annual disallowances as Rule A, and to the use of the lower confidence bound as 
Rule C. Rule B is a variant of Rule A: the annual disallowance is based on the 
point estimate except that the disallowance is waived if the nominal 95 percent 
lower confidence bound of the error rate is below the tolerance. Rule B will, of 
course, result in lower disallowances, on the average, than Rule A, because they are 
waived when the estimated error rate is above, but within the likely sampling error 
range of, the target. 
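For a single sample, the three rules described above can be written out as a short sketch. The function names and the numerical inputs are hypothetical; the multiplier 1.645 corresponds to the nominal 95 percent lower confidence bound mentioned in the text, and s_r stands for the estimated standard error of the estimated error rate.

```python
def rule_a(r_hat, r0, payment):
    """Rule A: disallowance based on the point estimate."""
    return max(r_hat - r0, 0.0) * payment


def rule_b(r_hat, s_r, r0, payment, t=1.645):
    """Rule B: point-estimate disallowance, waived unless the lower
    confidence bound r_hat - t * s_r is above the tolerance r0."""
    return (r_hat - r0) * payment if r_hat - t * s_r > r0 else 0.0


def rule_c(r_hat, s_r, r0, payment, t=1.645):
    """Rule C: disallowance based on the lower confidence bound."""
    return max(r_hat - t * s_r - r0, 0.0) * payment


if __name__ == "__main__":
    # Hypothetical state: tolerance .03, standard error .006,
    # Federal payment of 100,000 (in $000).
    for r_hat in (0.045, 0.038):
        print(r_hat,
              rule_a(r_hat, 0.03, 100_000),
              rule_b(r_hat, 0.006, 0.03, 100_000),
              rule_c(r_hat, 0.006, 0.03, 100_000))
```

With an estimate of .045 the lower bound (about .0351) clears the tolerance, so Rules A and B agree (1,500) and Rule C gives about 513; with an estimate of .038 the lower bound falls below .03, so Rule A assesses 800 while Rules B and C assess nothing.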

Later (in Section 3.7), we describe still another rule, Rule D. This rule 
increases the effective sample size for computing disallowances by accumulating the 
annual disallowances over successive years. The lower confidence bound of the 
accumulated disallowances is used for computing cash disallowances to be assessed 
until the sampling error of the total accumulated disallowance is sufficiently small. 
The accumulated disallowance based on the point estimates is then used for final 
settlement. 



3.2 Use of Point Estimate Versus Lower Confidence Bound in Computing Annual Disallowances 

Table 3-1 illustrates the consequences of using Rules A, B, and C for 
computing disallowances for alternative values of the excess of the overpayment 
error rate over the tolerance (column 1), the assumed standard error of the 
overpayment error rate (column 2), and the size of the Federal payment (column 4). 
The correct disallowances (computed using the unknown true error rate) for each 
case are shown in column 5, and the average over all possible samples of 
disallowances computed with Rules A, B, and C are shown in columns 6, 7, and 8. 
The coefficients of variation of the disallowances computed with Rule A are shown 
in column 9. The figures in the table are approximations based on the assumptions 
stated in the Notes for Table 3-1. The figures in columns 9 through 12 are of 
principal interest, and apply for any level of the Federal payment to states that have 
(approximately) one of the seven assumed excesses of error rate over tolerance 
shown in column 1 and one of the two levels of sampling error shown in column 2. 

While the figures in columns 9 through 12 of Table 3-1 are 
approximations, and are not those for any specific states, they are approximately 
representative of the situation in fiscal year 1984 for many states. For all large states, 
the sizes of the Federal QC samples are roughly the same, and the state QC samples 
vary from about 1200 to 2400. The .006 standard error of $\hat{R}$ assumed in Table 3-1 is 
roughly representative of the average sampling error in 1984 for these states 
(although the sampling error tends to be somewhat smaller for states with the larger 
state samples, and somewhat larger for the others). The sampling error of .012 
shown in the bottom deck of Table 3-1 is roughly illustrative of a number of 
medium-sized and smaller states (states with state samples of about 500 to 800). 

Table 3-1. Some illustrative approximate average results over repeated samples for annual disallowances computed by Rules A, B, and C* 

Column definitions: 
 (1)  Excess of overpayment error rate over target, $R - R_0$ 
 (2)  Standard error of $\hat{R}$, $\sigma_{\hat{R}}$ 
 (3)  $(R - R_0)/\sigma_{\hat{R}}$ 
 (4)  Amount of Federal payment, $A$ ($000) 
 (5)  Correct disallowance, $D = (R - R_0)A$ ($000) 
 (6)  Average of actual disallowances for Rule A, $\bar{D}_A$ ($000) 
 (7)  Average of actual disallowances for Rule B, $\bar{D}_B$ ($000) 
 (8)  Average of actual disallowances for Rule C, $\bar{D}_C$ ($000) 
 (9)  CV of actual disallowances for Rule A, $\sigma_{D_A}/\bar{D}_A$ 
(10)  Ratio of average actual to correct disallowance, $\bar{D}_A/D$ 
(11)  Ratio of average actual to correct disallowance, $\bar{D}_B/D$ 
(12)  Ratio of average actual to correct disallowance, $\bar{D}_C/D$ 

  (1)    (2)    (3)       (4)      (5)      (6)      (7)      (8)    (9)   (10)   (11)   (12)

  .08   .006   13.3   500,000   40,000   40,000   39,150   35,065   .075   1.00    .98    .88
  .05   .006    8.3   500,000   25,000   25,000   24,450   20,065    .12   1.00    .98    .80
  .03   .006    5.0   500,000   15,000   15,000   14,700   10,065    .20   1.00    .98    .67
  .02   .006    3.3   500,000   10,000   10,000    9,700    5,092    .30   1.00    .97    .51
  .01   .006    1.7   500,000    5,000    5,050    3,600    1,084    .57   1.01    .72    .22
 .003   .006     .5   500,000    1,500    2,100      550      119   1.07   1.40    .37    .08
   .0   .006    0.0   500,000        0    1,200      150       32   1.46      ∞      ∞      ∞

  .08   .006   13.3   100,000    8,000    8,000    7,830    7,013   .075   1.00    .98    .88
  .05   .006    8.3   100,000    5,000    5,000    4,890    4,013    .12   1.00    .98    .80
  .03   .006    5.0   100,000    3,000    3,000    2,940    2,013    .20   1.00    .98    .67
  .02   .006    3.3   100,000    2,000    2,000    1,940    1,018    .30   1.00    .97    .51
  .01   .006    1.7   100,000    1,000    1,010      720      217    .57   1.01    .72    .22
 .003   .006     .5   100,000      300      420      110       24   1.07   1.40    .37    .08
   .0   .006    0.0   100,000        0      240       30        6   1.46      ∞      ∞      ∞

  .08   .012    6.7    15,000    1,200    1,200    1,175      904    .15   1.00    .98    .75
  .05   .012    4.2    15,000      750      750      734      454    .24   1.00    .98    .61
  .03   .012    2.5    15,000      450      450      417      168    .40   1.00    .93    .37
  .02   .012    1.7    15,000      300      303      214       66    .57   1.01    .71    .22
  .01   .012     .8    15,000      150      164       57       16    .88   1.09    .38    .11
 .003   .012    .25    15,000       45       96       14        4   1.24   2.13    .31    .09
   .0   .012    0.0    15,000        0       72        6        2   1.46      ∞      ∞      ∞

*See Notes for Table 3-1 for definitions. 

Notes for Table 3-1 

The rules are defined as follows: 

Rule A: $D_A = (\hat{R} - R_0)A$ if positive; otherwise $D_A = 0$. 
Rule B: $D_B = (\hat{R} - R_0)A$ if $\hat{R} - 1.645 s_{\hat{R}} > R_0$; otherwise $D_B = 0$. 
Rule C: $D_C = (\hat{R} - 1.645 s_{\hat{R}} - R_0)A$ if positive; otherwise $D_C = 0$. 

The $\bar{D}_A$ is the average of $D_A$, etc. 

For each rule, $s_{\hat{R}}$ is the estimate of the standard error of $\hat{R}$ and $R_0$ is the target error rate. The computations shown in the table depend upon the 
following assumptions for each model. 

For Rule A, the computations assume that $\hat{R}$ is normally distributed and that $\hat{R}$ is an unbiased estimate of the true error rate $R$. 

For Rules B and C, the computations assume that the joint distribution of $\hat{R}$ and $s_{\hat{R}}$ is normal and that they are both unbiased estimates. It is assumed 
that the correlation of $\hat{R}$ and $s_{\hat{R}}$ is .7 (which is approximately the average correlation observed in simulations for Test Populations A, B, and C; see 
Appendix C, Table C-1), and that the variance of $s_{\hat{R}}$ is $(\beta - 1)\sigma^2_{\hat{R}}/4n'$. We have taken $n' = 360$ when $\sigma_{\hat{R}} = .006$, and $n' = 160$ when $\sigma_{\hat{R}} = .012$. The 
$\beta = 40$ is an approximate average value obtained for Test Populations A, B, and C from the assumed relationship 
$\sigma^2_{s_{\hat{R}}} = (\beta - 1)\sigma^2_{\hat{R}}/4n'$, where $\sigma^2_{s_{\hat{R}}}$ and $\sigma^2_{\hat{R}}$ were each obtained from 1000 replicated independent samples (see Appendix C). 

Column 6 of Table 3-1 illustrates that on the average (over all possible 
samples) disallowances computed by Rule A are closely equal to the correct 
disallowances unless $(R - R_0)/\sigma_{\hat{R}}$ is small, say less than about 1.5. It also shows 
relatively how much disallowances would be overestimated, on the average, when 
$(R - R_0)/\sigma_{\hat{R}}$ is small. It shows, for example, that if $R - R_0$ is .01 or greater, and if $\sigma_{\hat{R}}$ is 
approximately .006, the computed disallowance under Rule A will, on the average, 
be equal or very nearly equal to the correct amount. On the other hand, for a state 
with $\sigma_{\hat{R}} = .006$, and an excess of the overpayment error rate over the target of only 
about .003, the average annual disallowance would be 40 percent above the correct 
disallowance (column 10), and for a state with $\sigma_{\hat{R}} = .012$, the average annual 
disallowance would be more than twice the correct disallowance. 
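The Rule A figures can be approximated in closed form under the assumption stated in the Notes for Table 3-1 (a normally distributed, unbiased estimate). The sketch below uses the standard mean and second moment of the positive part of a normal variable; the function name is hypothetical, and the results are offered only as a rough check against columns 6, 9, and 10, which are themselves rounded approximations.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def rule_a_mean_and_cv(excess, sigma, payment):
    """Mean and coefficient of variation of max(R_hat - R0, 0) * payment
    when R_hat - R0 is normal with mean `excess` and std. dev. `sigma`."""
    z = excess / sigma
    cdf = _STD_NORMAL.cdf(z)
    pdf = _STD_NORMAL.pdf(z)
    mean_pos = excess * cdf + sigma * pdf
    second_moment = (excess ** 2 + sigma ** 2) * cdf + excess * sigma * pdf
    sd_pos = (second_moment - mean_pos ** 2) ** 0.5
    return payment * mean_pos, sd_pos / mean_pos


if __name__ == "__main__":
    # Rows of Table 3-1 with sigma = .006 and a payment of 500,000 ($000).
    for excess in (0.01, 0.003, 0.0):
        mean, cv = rule_a_mean_and_cv(excess, 0.006, 500_000)
        print(excess, round(mean), round(cv, 2))
```

For an excess of .003 this gives an average near 2,090 and a CV near 1.07, and for an excess of zero an average near 1,200 and a CV near 1.46, in line with the corresponding table entries.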

Rule B is the same as Rule A except that no disallowance is assessed 
unless there is strong evidence that the true error rate is above the target. More 
specifically, with Rule B, the disallowance is 

$$D_B = \begin{cases} (\hat{R} - R_0)A & \text{if } \hat{R} - t s_{\hat{R}} > R_0 \\ 0 & \text{otherwise} \end{cases}$$



with t = 1.645 if a nominal 5 percent point (the lower bound of the nominal 
90 percent confidence interval) is to be used. Alternatively, a lower confidence 
bound would be computed using the log-Jackknife-replicate procedure described in 
Section 2.4, which yields a probability associated with the lower confidence bound 
that is considerably closer to the nominal probability. 

It is seen from Table 3-1 (column 11) that the use of Rule B avoids the 
overassessment of disallowances that results, on the average, from Rule A when the 
overpayment error rate is close to the tolerance. Instead, Rule B very slightly 
underassesses the disallowances, on the average, when $(R - R_0)/\sigma_{\hat{R}}$ is large and, as 
expected, underassesses them considerably when the sampling error of $\hat{R}$ is large 
relative to the excess of the overpayment error rate over the target. 

We have also evaluated the application of Rule B by using the 
simulated samples drawn from the Test Populations A, B, and C, and using the 
criterion $(\hat{R} - 1.645 s_{\hat{R}}) > R_0$ rather than the suggested log-Jackknife-replicate 
transformation. The results are presented in Table 3-2. We conclude from Table 3-2 that for 
the three test populations the application of Rule B, using sample estimates of $R$ and 
$\sigma_{\hat{R}}$, gives quite satisfactory results, i.e., the $\bar{D}_B/A$ in each case is close to $R - R_0$, except 
for the smallest sample size. For the smallest sample size for Test Population C, 
especially, the ratio of average computed to correct disallowance (last column) is 
sufficiently small to result in underestimation of disallowances by about 10 percent. 
The ratios in the last column of Table 3-2 are reasonably close to and confirm the 
corresponding approximate ratios in column 11 of Table 3-1, for comparable values 
of $(R - R_0)/\sigma_{\hat{R}}$. Of course, the results presented in Table 3-2 are averages from 1000 
independent replicate samples and are subject to some sampling variability. 
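The derived columns of Table 3-2 follow from simple arithmetic on the other columns, as the sketch below illustrates for Test Population C. The values are transcribed from the table; the variable and function names are hypothetical.

```python
# Test Population C rows of Table 3-2:
# (standard error of R_hat, average computed D_B / A, correct R - .03)
POPULATION_C_ROWS = [
    (0.0079, 0.0359, 0.0362),
    (0.0088, 0.0360, 0.0362),
    (0.0103, 0.0352, 0.0362),
    (0.0143, 0.0326, 0.0362),
]


def derived_columns(rows):
    """Return ((R - .03) / sigma, ratio of computed to correct) per row,
    rounded the way the table reports them."""
    return [(round(correct / sigma, 1), round(computed / correct, 2))
            for sigma, computed, correct in rows]


if __name__ == "__main__":
    for standardized_excess, ratio in derived_columns(POPULATION_C_ROWS):
        print(standardized_excess, ratio)
```

The last row gives a ratio of about .90, which is the roughly 10 percent underestimation noted in the text for the smallest sample size.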

The coefficients of variation (CV) of the $D_A$ for the illustrative samples 
are shown in column 9 of Table 3-1. It is seen that the CV increases rapidly as the 
excess of the overpayment error rate over the tolerance decreases. 

For Rule A, the magnitude of the sampling errors relative to the 
disallowances (illustrated by the "CV of actual disallowances" shown in column 9 of 
Table 3-1) has been the basis for a concern expressed by some states that the amount 
of the disallowance may vary widely due to sampling error. This concern has led 
some of the states to propose the adoption of Rule C for computing disallowances, 
i.e., that disallowances be computed by using a lower confidence bound instead of 
the point estimate. The consequences of doing this for a one-tailed 95 percent 
confidence bound (i.e., the lower bound of a two-tailed 90 percent confidence interval) are 
illustrated in columns 8 and 12 of Table 3-1. If such a lower confidence bound were 
adopted, the disallowance for a state would rarely exceed the correct value, and then 
only by a relatively small amount. Also, as seen in Table 3-1, the average of such 
disallowances would be below, and often far below, the correct disallowance. 
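The comparison of the three rules over repeated samples can be sketched with a small Monte Carlo simulation under the assumptions listed in the Notes for Table 3-1 (joint normality, correlation .7, and a variance of the estimated standard error read from the notes as (β - 1)σ²/4n' with β = 40). The function name, seed, and number of replicates are hypothetical, and the sketch is not claimed to reproduce the table's figures, which are themselves described as approximations.

```python
import math
import random


def simulate_average_disallowances(excess, sigma, n_fed, payment,
                                   beta=40.0, rho=0.7, t=1.645,
                                   reps=100_000, seed=12345):
    """Average Rule A, B, and C disallowances over simulated samples.

    excess is the true R - R0. R_hat and its estimated standard error s
    are drawn as correlated normal variables, both unbiased, with
    Var(s) = (beta - 1) * sigma**2 / (4 * n_fed)  (assumed reading of the notes).
    """
    rng = random.Random(seed)
    sd_s = sigma * math.sqrt((beta - 1.0) / (4.0 * n_fed))
    total_a = total_b = total_c = 0.0
    for _ in range(reps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        deviation = excess + sigma * z1  # R_hat - R0
        s = sigma + sd_s * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        total_a += max(deviation, 0.0)
        if deviation - t * s > 0.0:
            total_b += deviation
        total_c += max(deviation - t * s, 0.0)
    return tuple(payment * total / reps for total in (total_a, total_b, total_c))


if __name__ == "__main__":
    # First row of Table 3-1: excess .08, sigma .006, n' = 360, A = 500,000.
    print(simulate_average_disallowances(0.08, 0.006, 360, 500_000))
```

For the first row of the table, the simulated Rule A average should fall near 40,000 and the Rule C average near (.08 - 1.645 × .006) × 500,000, or about 35,065.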






Table 3-2. Average annual disallowances, computed for Rule B, for specified sample sizes (based on 
1000 independent samples from each test population, and assuming tolerance of $R_0 = .03$) 

Column definitions: 
(1)  Sample size 
(2)  Standard error of $\hat{R}$, $\sigma_{\hat{R}}$ 
(3)  $(R - .03)/\sigma_{\hat{R}}$ 
(4)  Proportion of samples with $D_B > 0$ 
(5)  Disallowances*: average computed disallowance proportion, $\bar{D}_B/A$ 
(6)  Disallowances*: correct disallowance proportion, $R - .03$ 
(7)  Disallowances*: ratio of average computed to correct disallowance 

   (1)        (2)     (3)     (4)      (5)      (6)    (7)

Test Population A (R = .0730) 
2400  360   .0071     6.1   1.000    .0431    .0430   1.00
1200  360   .0079     5.4   1.000    .0424    .0430    .99
 880  260   .0093     4.6   0.999    .0426    .0430    .99
 350  160   .0129     3.3   0.957    .0422    .0430    .98

Test Population B (R = .0795) 
2400  360   .0071     7.0   1.000    .0489    .0495    .99
1200  360   .0087     5.7   1.000    .0490    .0495    .99
 880  260   .0103     4.8   1.000    .0487    .0495    .98
 350  160   .0152     3.3   0.984    .0490    .0495    .99

Test Population C (R = .0662) 
2400  360   .0079     4.6   0.997    .0359    .0362    .99
1200  360   .0088     4.1   0.996    .0360    .0362    .99
 880  260   .0103     3.5   0.976    .0352    .0362    .97
 350  160   .0143     2.5   0.791    .0326    .0362    .90

*See Table 3-1 and Notes for Table 3-1 for definitions. 






In Section 3.7 we describe an alternative procedure, Rule D, for 
computing and assessing disallowances that may have advantages over assessing an 
annual disallowance solely on either the point estimate or a lower confidence 
bound. Before doing this, however, we review some of the implications of using a 
lower confidence bound rather than the point estimate in computing disallowances. 
These issues include choice of a probability to associate with a lower confidence 
bound, improved procedures for computing lower confidence bounds, the 
comparative precision of the lower confidence bounds and the point estimate, a 
procedure to avoid a concern that poor-quality work on QC in a state could work to 
the disadvantage of the Federal government by lowering the lower confidence 
bound, and some limited discussion of optimum sample size considerations. 



3.3 Some Implications and Issues Concerning Use of the Lower Confidence Bound 

We comment here on a few points that are relevant if the lower 
confidence bound is to play a role in the computation of disallowances, whether 
based on Rule B or C discussed above, or on Rule D described later (Section 3.7). 



3.3.1 Choice of Nominal Confidence Level 

The term "nominal confidence level" refers to the desired probability 
that a confidence interval include the true value that is being estimated. The actual 
probability may differ from the nominal, although, with appropriate sample design 
and sufficient sample size, the actual and nominal probabilities may be reasonably 
close together. For this discussion, we assume they are equivalent. The issue to be 
considered is at what level the probability associated with a confidence interval, or 
with an upper or lower confidence bound, is to be specified. 

We assume that a 90 percent confidence interval is defined in such a 
way that a 5 percent probability is associated with each tail, that is, the lower 
confidence bound is such that the probability is about 5 percent that it exceeds the 
value being estimated (which we refer to as the true error rate), and the upper 
confidence bound is such that the probability is about 5 percent that it is below the 
true error rate. Similarly, for a 95 percent confidence interval, the probabilities are 
about 2-1/2 percent that the lower bound exceeds and are also about 2-1/2 percent 
that the upper bound is below the true error rate. The higher the specified 
probability for inclusion of the true value within the confidence interval, the lower 
is the probability associated with each tail. However, a choice must be made of the 
confidence level to be used; this is a policy decision. 

We note that while practice does and should vary, depending on the 
circumstances and policy judgments made, in much statistical practice 95 percent 
confidence intervals are displayed and used as measures of precision. Also, the use 
of a 95 percent confidence level has been the common practice in computing 
two-tailed confidence intervals to provide measures of precision in AFDC. While there 
is no necessary reason for adopting the same probability level for computing a lower 
one-tailed confidence bound, it seems reasonable and is common practice to do so. 
In a number of analyses, we have displayed both 90 and 95 percent two-tailed 
confidence intervals, and corresponding 95 percent (or 5 percent) and 97-1/2 percent 
(or 2-1/2 percent) lower (and upper) confidence bounds. We have adopted a 
95 percent lower confidence bound more generally for illustration (or a 95 percent 
upper confidence bound in some instances) because it seems to represent the most 
common practice and is consistent in probability level with the level in use in 
AFDC for measuring precision. However, to the extent that lower confidence 
bounds have a role in computing disallowances, the adoption of a confidence level 
can have a substantial impact on the resulting magnitude of the disallowance, and 
consequently the choice of an appropriate probability level should be a matter for 
policy determination. 
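The correspondence between two-tailed confidence levels, tail probabilities, and normal multipliers discussed above can be illustrated with a short sketch. The function names are hypothetical, and the multipliers assume the normal approximation that underlies the 1.645 and 1.96 values used in this report.

```python
from statistics import NormalDist


def tail_probability(two_tailed_level):
    """Probability in each tail of a two-tailed interval, e.g. .90 -> .05."""
    return (1.0 - two_tailed_level) / 2.0


def normal_multiplier(one_tail_probability):
    """Multiplier t such that a standard normal exceeds t with the given
    probability, e.g. .05 -> about 1.645."""
    return NormalDist().inv_cdf(1.0 - one_tail_probability)


if __name__ == "__main__":
    for level in (0.90, 0.95):
        tail = tail_probability(level)
        print(level, tail, round(normal_multiplier(tail), 3))
```

A 90 percent two-tailed interval thus corresponds to a 95 percent one-tailed bound with a multiplier of about 1.645, and a 95 percent two-tailed interval to a 97-1/2 percent one-tailed bound with a multiplier of about 1.96.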



3.3.2 Improved Procedures for Computing Confidence Bounds 

Another issue concerns the way in which the confidence interval, and 
therefore its lower bound, are computed. The present procedure in AFDC in 
computing a lower confidence bound, L, is 

$$L = \hat{R} - t s_{\hat{R}}$$

using the formulas given by Equations (1) and (3) in Chapter 1, respectively, for 
estimating $\hat{R}$ and $s_{\hat{R}}$, and using t = 1.96 for a 95 percent confidence interval and for a 
97-1/2 (or 2-1/2) percent lower confidence bound. Alternatively, we have suggested 
above, for consideration, the use of t = 1.645 for a 95 percent (or 5 percent) lower 
bound. As we have shown earlier (Section 2.3), with the highly skewed distribution 
of overpayment errors, the probability that the lower bound is greater than the true 
error rate is much less than the nominal 2-1/2 percent. We have also shown that 
the results are similar for the lower bound of a 90 percent confidence interval (i.e., 
for a 95 percent lower confidence bound). In Section 2.4, we have suggested the use 
of a log-Jackknife replicate method of computing confidence intervals which, on the 
basis of the analyses we have completed, provides probabilities considerably closer to 
the nominal levels. As noted before, the results are encouraging, although further 
work on the problem is desirable, particularly in the search for even more useful 
transformations. 

We also note that the computation of confidence intervals using the 
log-Jackknife-replicate method involves more computing than if computed by the 
simpler procedure, but with present computer speeds and costs, the difference seems 
to be unimportant in relation to the potential impact on disallowances if based on a 
lower confidence bound (see footnote in Section 2.5.2). 



3.3.3 Comparative Precision of Lower Confidence Bound and Point Estimate 

In Section 2.5.2 of this report and in Section D.l of Appendix D we 
explain why the lower confidence bound of the overpayment error rate has 
considerably greater precision than the point estimate, contrary to the usual 
situation. We illustrate the comparisons for three test populations. The principal 
relevance to this discussion is that possible questions concerning the precision of the 
lower confidence bound do not militate against its use in computing disallowances. 






3.3.4 Controlling Impact of Sample Size and of Poor-Quality QC Work on Lower Confidence Bound 

Another problem with the use of the lower confidence bound in 
computing disallowances is that it can be lowered by decreasing the sample size or by 
lowering the quality of the QC reviews done by the state. The first of these effects 
can be controlled by insistence on minimum sizes for the state sample and the 
Federal subsample. Some discussion of the implications of alternative sample sizes 
appears in this subsection and in Appendix D, and also in Sections 3.4 and 3.5. 

It is easier to control sample size than the quality of QC work. The 
presence of poor quality work can reasonably be suspected by an unusually low 
correlation between the state and Federal findings for the cases in the Federal 
subsample. An unusually low correlation, or continued observation of a 
moderately low correlation (say below .8 or .85) may call for more intensive 
monitoring of the state's QC operation. The distributions of correlations due to 
sampling, and the distribution of estimated correlations by states, are given in 
Appendix D. A study of such distributions, along with updating of such analyses 
from time to time, can provide insight into correlations that may be lower than can 
be expected from sampling variability alone. 

The impact of low correlations on lower confidence bounds of 
overpayment error rates can be reduced substantially by adopting a "minimum 
correlation variance estimator." This is accomplished whenever the estimated 
correlation in the formula for the variance (see Chapter 1, Equation (3)) is below a 
specified minimum value, say .8, by replacing the correlation in the formula by the 
specified minimum value. This decreases the estimated sampling error in such 
instances, thus increasing the computed lower bound. Such low correlations may 
occur because of poor-quality QC work, or because of sampling variability. 
Whichever is the cause, the adoption of the minimum correlation variance 
estimator provides a reasonable adjustment without having any effect on the point 
estimate. The selection and use of such a minimum value is discussed in 
Appendix D.¹ 
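The flooring step of the minimum correlation variance estimator can be sketched as follows. Equation (3) of Chapter 1 is not reproduced in this section, so the variance expression below is NOT the report's formula: it is the familiar textbook approximation for a double-sampling regression estimator, used here only as a stand-in to show the direction of the effect. All names and numerical inputs are hypothetical.

```python
import math


def floored_correlation(r, r_min=0.8):
    """Replace an estimated correlation below r_min by r_min."""
    return max(r, r_min)


def illustrative_standard_error(s_y, r, n_state, n_fed):
    """Stand-in standard error (textbook double-sampling regression
    approximation, an assumption; not Equation (3) of the report)."""
    variance = s_y ** 2 * ((1.0 - r ** 2) / n_fed + r ** 2 / n_state)
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical state: unit standard deviation, state sample of 2400,
    # Federal subsample of 360, estimated correlation of .6.
    r_raw = 0.6
    r_used = floored_correlation(r_raw)
    print(illustrative_standard_error(1.0, r_raw, 2400, 360))
    print(illustrative_standard_error(1.0, r_used, 2400, 360))
```

Under this stand-in formula, raising the correlation from .6 to the floor of .8 lowers the estimated standard error, and hence raises the computed lower bound, without touching the point estimate.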



3.4 Some General Considerations on Optimum Sample Size 

We note first, and strongly emphasize, that, except for a few 
introductory remarks, this discussion of optimum QC sample size assumes that the 
only role of the QC sample is that of computing disallowances, whereas the 
principal reason for initiating the QC sample and a principal reason for its 
continued use is to provide information on the frequency and magnitude of errors 
and their sources, in order to guide improvement and control of the administration 
of AFDC. Effectively serving these purposes is an exceedingly important role of 
AFDC-QC. It is obvious that the payoff through reductions in misspent funds can be 
very great indeed if overpayment error rates are substantially reduced through such 
efforts. We note, for example, that the reductions in error rates in recent years (e.g., 
1980 through 1984) have been substantial, involving reductions of many millions of 
dollars in improper overpayment of AFDC benefits. 

Presumably, an important part of these reductions has resulted directly 
and indirectly from QC efforts in the states. Nevertheless, the optimum sample 
sizes needed for guiding improvements in the design and administration of AFDC 
are not easily determined. We do not here attempt to make that determination in 
an objective way, but we do emphasize that the sample, for this purpose, should be 
large enough to facilitate reasonably precise analyses by population subgroups. 
These should include important subclasses of recipients, so that the sample would 
provide separate estimates for those working and not working, those with or 
without other income sources, and other subgroups, and also for major geographic 
subdivisions. The latter may help in comparing administrative effectiveness within 
different operating units within the state units. These types of analyses are 
important and necessary, but it is not easy to specify the sample size needed for such 
analyses. These analyses are to be done primarily with the state samples which are, 
of course, considerably larger than the Federal subsamples. For analyses by various 
subclasses, it may be useful to accumulate samples over two or three years, and also 
to plot control charts for subclasses based on quarterly or more frequent QC results. 
The role of the Federal subsamples in this regard is simply to monitor the state QC 
efforts so that the state samples will be reasonably effective in identifying sources of 
errors by type. 

¹From Appendix D, Table D-2, it is seen that the observed correlations for states have been increasing. 
The 30th percentile of the estimates of correlations by states increased from .76 in 1961 to .87 in 1984. 
From these it seems that, until additional evidence is available, the choice of a minimum r of .80 to 
.85 would be quite reasonable. Presumably, the lower values of the estimated correlations by states in 
the table reflect to a considerable degree the consequences of sampling variability (see Figures D-2A, 
B, and C). 

One of the important considerations concerning the sample sizes that 
are needed to provide information for corrective action (and also for computing 
disallowances) is that when a state welfare system is "under control," that is, it has 
reduced its overpayment error rate in total and in the major jurisdictions or 
subclasses to an acceptably low level, perhaps to or below the current three percent 
tolerance, there may be little to gain from additional efforts at corrective action (and 
nothing to gain from disallowances). Consequently, it seems reasonable for such a 
state to reduce the QC program to a monitoring role, primarily to provide assurance 
that the overpayment error rate does not rise substantially again. This could be 
done with relatively small sizes of state and Federal samples (for example, perhaps 
300 to 600 for the state sample and 150 for the Federal subsample). 

We mention one other consideration with regard to sample size: any 
effort to optimize sample size through a cost-benefit approach must take account of 
the total expenditures involved. The exception is the case mentioned in the 
preceding paragraph, where the administration of AFDC is demonstrably under 
good control. 

From a cost-benefit point of view, it may be worth using only a 
relatively small QC sample in the smaller states. Cost-benefit considerations call for 
higher precision and greater detail for large states. Large samples can provide 
analyses at shorter time intervals, or by major administrative areas, or for 
population subgroups, and may greatly facilitate identifying problems and taking 
corrective action. In New York, for example, in fiscal year 1984 the cost of AFDC was 
$957 million, while in Wyoming it was about $6 million, or about 6/10 of one 
percent of the New York cost. Delaying or failing to take effective corrective action 
in Wyoming could not noticeably impact total erroneous expenditures in the AFDC 
program, whereas delay or ineffective action could be enormously costly in New 
York (and in each of a number of other large states). It would be totally 
cost-ineffective to call for equal sample sizes or equal precision in these two states: 
it would be too costly to take a large sample in Wyoming, and large losses would be 
risked if a small sample were used in New York, at least until the error rate is 
acceptably low. 

We think the need for larger samples in the larger states is reasonably 
obvious from a cost-benefit point of view without further comment and 
justification. The analysis in Section 3.5 of optimum sample size for determining 
disallowances using a lower confidence bound provides a rather striking illustration 
of this point. 



3.5 Optimum Sample Size for Computing Disallowances 

We now turn to consideration of optimum sample size when the sole 
purpose of QC is assumed to be the computation of disallowances, and the goal is to 
minimize the overall cost to the Federal government of overpayment errors in the 
AFDC program, taking joint account of the cost to the Federal government of QC 
and of the returns from disallowances. 

When the point estimate is used to compute disallowances (Rule A) it 
is not feasible to determine objectively an optimum sample size based on expected 
(or average) results. This is because, whatever the sample size, the sampling errors 
of the estimates of the overpayment error rates are both positive and negative, and 
when the estimated error rate is used in the computation of the disallowance, the 
long-run average effect of the sampling error in the estimation of disallowance is 
close to zero for high error rates and is a decreasing function of the sample size. If 
the true error rate is close to the target rate, the average of the disallowances is 
positive (as discussed earlier), and increasingly so, as the sample size is decreased. 
Consequently, it is no longer true that there is an approximately equal chance of 
positive and negative errors. However, it is still true that the Federal government 
gains more, on the average, as the sample size is decreased, since the average 
expected disallowance is larger. (See Appendix F.) 
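
The behavior described above can be illustrated with a minimal sketch. It assumes, as a simplification not stated in this section, that the estimated excess of the error rate over the target is approximately normally distributed around the true excess with standard error `se`, and that negative disallowances are set to zero. The function names and the illustrative inputs (a Federal payment of 100, standard errors of .006, .012, and .024) are hypothetical choices for the example; the derivation actually used is the one in Appendix F.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_rule_a(excess, se, payment):
    """Expected value of max(0, X) * payment, where X ~ N(excess, se^2).

    `excess` is the true error rate minus the target rate; `se` is the
    standard error of the estimated rate (assumed normal sampling error).
    """
    z = excess / se
    return payment * (excess * normal_cdf(z) + se * normal_pdf(z))


payment = 100.0  # hypothetical annual Federal payment, in $ million
for se in (0.006, 0.012, 0.024):  # a larger se corresponds to a smaller sample
    at_target = expected_rule_a(0.00, se, payment)
    near_target = expected_rule_a(0.01, se, payment)
    far_above = expected_rule_a(0.05, se, payment)
    print(f"se={se:.3f}  at target: {at_target:.2f}  "
          f"excess .01: {near_target:.2f}  excess .05: {far_above:.2f}")
```

Under these assumptions, a state exactly at the target has a positive expected disallowance that grows as the standard error grows (that is, as the sample shrinks), while for a state far above the target the expected disallowance is essentially the correct one regardless of sample size.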

Thus, from a simplistic point of view, if the point estimate is used, the 
optimum is to make the state sample and the Federal subsample as small as 
possible (say, a sample of 2) while still making it possible to compute an estimate. 
Of course, this is ridiculously small; neither the Federal government nor the state 
would be willing to deal with such a sample. It simply means that 
we do not have a basis for obtaining an optimum sample size based jointly on cost 
and expected or average return from disallowance. 

One might make some assumptions about the cost of errors in the 
point estimate that result in much too large a disallowance in some years, and much 
too small in others, and possibly arrive at an optimum based on the costs and 
disadvantages of such variability. We have not taken this approach here, because it 
does not appear very promising, at least at the present stage of this analysis. We 
conclude that the determination of optimum sample size for computing 
disallowances by Rule A is a judgment decision, not effectively guided by a 
mathematical solution, at least for the present. 

The situation would be quite different if the lower confidence bound 
were to be used in computing disallowances. In this case, from the Federal point of 
view, the larger the samples for a state, the smaller the sampling error, and 
therefore the higher the average disallowance. But to achieve a larger sample costs 
additional Federal funds, both for the Federal subsample and for the state sample. 
Under these circumstances, it is possible to determine the sample size that 
maximizes the Federal return. This is done in Appendix G where details are 
presented. We summarize some results here. 

In this analysis it is assumed that the Federal costs for QC include half 
of the cost of the state QC sample, and the full cost of the Federal QC sample. We 
used, for determining unit costs, the costs and caseloads quoted in a memorandum 
from OFA outlining a meeting on September 4, 1984, with the Ways and Means 
Staff regarding the AFDC Quality Control System and Error Rate Disallowances.^2 
The resulting assumed unit costs were $130 per case for the Federal share (one half 
of the total unit cost) of the state sample, and $330 per case for the Federal 
subsample. We also assumed a target error rate of 3 percent, as called for in 1984 
and afterwards by present legislation. Various levels of total Federal payments 
were assumed that are illustrative of payment levels in the various states. We also 
assumed that the Federal subsample size was 15 percent of the state sample size, as 
it is in some of the larger states. The computations could readily be carried through 
for other subsampling fractions, and would yield similar results. We also assumed 
three levels of the standard deviation of the payment errors, that the correlation of 
state and Federal findings was .9, and that the correlation of R and s_R was .8.^3 
Given the above assumptions, we obtained the summary results displayed in 
Table 3-3. 

^2 Memorandum to Debbie Chassman from Barbara Levering, Department of Health and Human 
Services, Office of Family Assistance, Social Security Administration, dated August 31, 1984, 
September 4 Meeting with Ways and Means Staff on AFDC Quality Control System and Error Rate 
Disallowances and attached outline on Briefing Points for Ways and Means Staff. 

Table 3-3. Approximate optimum Federal sample sizes (n') for computing annual disallowances based 
on a lower confidence bound (Rule C), for alternative levels of total Federal payment, and 
of excess of overpayment error rate over the target rate 

Size of Federal   Standard              Excess of payment error rate over target
payment           deviation of       ----------------------------------------------
($1,000,000)      payment errors         .01      .02      .03      .04      .06
---------------   --------------     -------  -------  -------  -------  -------
       20               30                --       --       84       84       84
                        50                --       --      117      117      117
                        70                --       --      140      147      147

       50               30                --      154      154      154      154
                        50                --      215      217      217      217
                        70                --      239      271      271      271

      300               30               510      510      510      510      510
                        50               673      716      716      716      716
                        70               545     800+     800+     800+     800+

      500               30               716      716      716      716      716
                        50              800+     800+     800+     800+     800+
                        70              800+     800+     800+     800+     800+


^3 Elsewhere we have assumed .7 for this correlation (see, for example, Appendix E). The .8 assumption 
here was based on early results. We have not regarded it as worthwhile to recompute assuming a 
correlation of .7. 



We note that the optimum Federal sample size becomes zero (denoted 
by "— " in the table) as the excess of the overpayment error rate over the target gets 
small. This means that, in such instances, the amount recovered in disallowance is 
equal to or less than the Federal cost of QC sampling. On the other hand, the 
optimum sample sizes increase and become considerably larger than the present 
Federal subsample sizes as the excess of the overpayment error rates over the target 
increases, and as the total Federal payment becomes large. (Note that an entry of 
800+ in the table signifies that the optimum Federal sample size is greater than 800. 
Our computation did not extend beyond that size.) We emphasize, again, that this 
optimization is for separate computation of disallowances each year, using the lower 
confidence bound in the computations (Rule C), and that the optima are computed 
only to maximize net return from disallowances to the Federal government. 

From the point of view of a state (instead of the Federal government), 
the effect of jointly minimizing a state's cost of conducting the QC operation and its 
losses from disallowances is totally different. Obviously, if a lower confidence 
bound is used to compute disallowances, the optimum size of a state sample is the 
smallest that it is permitted to use, for this would increase the sampling error and 
therefore lower the lower confidence bound and the disallowance. It would 
simultaneously reduce the cost of QC. 



3.6 The Impact in FY 1981 of Three Disallowance Rules - Rules A, B, and C 

For fiscal year 1981, disallowances were assessed against 27 states and 
Puerto Rico (see Table 3-4). Waivers were granted in six of those cases. The 
disallowances assessed were computed by Rule A, that is, 

D = (R - R_0) A, if positive, 

where R_0 and A vary from state to state. (For the states of Arizona and Texas, a 
somewhat different and more complex computation was used, but the difference is 
not relevant to this discussion.) 



Table 3-4 presents the assessed disallowances for Rule A. It also 
presents, for comparison, what they would have been if computed by Rules B or C 
(as described in Section 3.2). Rule B computes the disallowances as in Rule A, except 
that if the 95 percent lower confidence bound is less than the target level, the 
disallowance is waived. The lower confidence bound is computed as R - 1.645 s_R, 
where R and s_R are computed by the current procedures (Equations (1) and (3) in 
Chapter 1, except for states that use a stratified sampling estimator). 

Rule C bases the disallowance on the lower bound alone, as has been 
suggested by some. That is, the disallowance is computed as the excess of the lower 
bound over the target rate, applied to the Federal payment: 

D = (R - 1.645 s_R - R_0) A, if positive. 
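
The three rules can be compared side by side in a short sketch. The function names and the illustrative inputs below (an estimated rate of .05, a target of .03, a standard error of .006, and a Federal payment of 100) are hypothetical; the sketch takes the estimated rate and its standard error as given rather than computing them by Equations (1) and (3) of Chapter 1, and it ignores the special computations used for Arizona and Texas.

```python
Z_95_ONE_SIDED = 1.645  # multiplier for the 95 percent lower confidence bound


def lower_bound(r_hat, s_r):
    """95 percent lower confidence bound of the estimated error rate."""
    return r_hat - Z_95_ONE_SIDED * s_r


def rule_a(r_hat, r0, payment):
    """Rule A: excess of the point estimate over the target, if positive."""
    return max(0.0, (r_hat - r0) * payment)


def rule_b(r_hat, s_r, r0, payment):
    """Rule B: as Rule A, but waived when the lower bound is below the target."""
    if lower_bound(r_hat, s_r) < r0:
        return 0.0
    return rule_a(r_hat, r0, payment)


def rule_c(r_hat, s_r, r0, payment):
    """Rule C: excess of the lower bound over the target, if positive."""
    return max(0.0, (lower_bound(r_hat, s_r) - r0) * payment)


# Hypothetical state: estimated rate .05, target .03, standard error .006,
# Federal payment of 100 (in $ million).
r_hat, s_r, r0, payment = 0.05, 0.006, 0.03, 100.0
print("Rule A:", round(rule_a(r_hat, r0, payment), 3))
print("Rule B:", round(rule_b(r_hat, s_r, r0, payment), 3))
print("Rule C:", round(rule_c(r_hat, s_r, r0, payment), 3))
```

For any given state, Rule C never exceeds Rule B, and Rule B never exceeds Rule A, which is the ordering seen in the totals of Table 3-4 apart from the specially computed states.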

The totals for all 27 states are shown for each rule, as well as the totals 
reduced by the amounts for the states for which the disallowance was waived. Thus, 
after waivers, the total disallowance is 17 percent less for Rule B than for Rule A, 
and is 58 percent less for Rule C than for Rule A. The larger aggregate loss for 
Rule C occurs because sampling errors are large enough that the 95 percent lower 
confidence bounds are considerably below the point estimates. 
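
The two percentages quoted above can be checked directly from the "Total, after waivers" row of Table 3-4. In the sketch below the variable names are hypothetical, and the Rule B total is read as 56,024,535; one digit of that figure is unclear in the scanned table, so this reading is an assumption, although it does not affect the result to the nearest percent.

```python
# "Total, after waivers" row of Table 3-4, in dollars.
total_rule_a = 67_444_525
total_rule_b = 56_024_535  # assumed reading of a partly illegible figure
total_rule_c = 28_434_713

# Relative reduction in total disallowance compared with Rule A.
reduction_b = 1.0 - total_rule_b / total_rule_a
reduction_c = 1.0 - total_rule_c / total_rule_a

print(f"Rule B is {100 * reduction_b:.0f} percent less than Rule A")
print(f"Rule C is {100 * reduction_c:.0f} percent less than Rule A")
```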



3.7 An Alternative Rule for Computing Disallowance - Rule D 

We describe here another rule, designated Rule D, which combines 
certain attractive characteristics of Rules A and C, but mitigates certain unattractive 
characteristics from the points of view of the Federal government and of the state. 

Disallowances as now computed by Rule A are subject to relatively 
large sampling errors in many states, even with the larger annual samples in use in 
the QC program in some states. These relatively large sampling errors can lead to 
substantially overstated or understated annual disallowances in a given year. 



Table 3-4. Disallowances based on alternative rules, FY 1981 

                           Federal                      Disallowance
State                  expenditure         Rule A         Rule B         Rule C
--------------------  -------------  -------------  -------------  -------------
AL                       55,257,339         46,527 
AZ                       18,204,168        209,475*       293,014*         1,642*
CA                    1,270,296,772     35,066,542     35,066,542     17,449,396 
CO                       47,081,958      1,898,203      1,898,203      1,104,828 
CT                      102,601,922        313,038 
FL                      121,842,954      3,467,041      3,467,041      2,408,721 
HI                       46,619,415      1,211,639      1,211,639        283,359 
ID                       14,481,785        691,187        691,187        243,773 
IN                       83,266,989        112,744 
KS                       47,251,492      1,902,865      1,902,865      1,174,489 
MD                      113,146,541      1,325,172*
ME                       40,439,640        167,744 
MN                      134,920,297        571,253 
NE                       27,006,307        279,947 
NJ                      270,515,844      1,279,810*
NM                       32,394,291      2,553,545      2,553,545      1,800,304 
NY                      755,115,221      6,269,722 
OH                      333,931,792      3,930,043 
OK                       58,315,715      1,508,394      1,508,394        526,370 
SC                       56,158,502      1,003,946*     1,003,946*       456,359*
SD                       11,866,284         12,804 
TN                       59,079,920      1,754,496      1,754,496      1,093,902 
TX                       87,575,396      1,112,295      1,396,127        273,375 
UT                       34,319,580        299,747*
VT                       26,751,544        225,194*
WA                      118,607,888      4,161,714      4,161,714      1,750,039 
WY                        4,235,182        412,782        412,782        324,958 

Totals                3,971,284,738     71,787,869     57,321,495     28,892,915 
Total, after waivers                    67,444,525     56,024,535     28,434,713 

*Denotes that the disallowance was waived. 

Rule A: The current rule, based on the point estimate. 
Rule B: Based on excess of point estimate over the target error rate, but only if the 95 percent lower 
        confidence bound is above the target error rate. 
Rule C: Based on the excess of the 95 percent lower confidence bound over the target error rate. 

Note: A somewhat different computation of the disallowance was done for the states AZ and TX than would 
result from a simple application of Rule A. The figures for these states in the column headed "Rule A" 
reflect the disallowance as assessed rather than the disallowance computed by Rule A. On the other 
hand, the figures in the column headed "Rule B" are computed by Rule B, which for these states gives 
the same disallowance as obtained by Rule A. 



The relative magnitude of these sampling errors is illustrated by the 
coefficients of variation shown in column 9 of Table 3-1. The limits of 95 percent 
confidence intervals would vary from sample to sample, but, on the average, would 
correspond to about two times the coefficients of variation shown in that table. For 
example, the standard error of R of .006 shown in column 2 is approximately 
illustrative of the standard errors in the states with the larger QC samples (a state 
sample of about 2400 and a Federal subsample of about 360). Column 9 shows that 
for such a large state, with an error rate of 5 percent (i.e., R - R_0 = .02, with a target 
level of R_0 = .030, and a sampling error of .006) the coefficient of variation of the 
estimated disallowance is .30. Consequently, for such a state, the bounds of the 
95 percent nominal confidence intervals would average between 60 percent above 
and 60 percent below the correct disallowance. About 5 percent of the time, the 
value being estimated will be either below or above the confidence interval. For a 
smaller state with a sampling error of .012, this range would be approximately 
doubled. These are relatively wide ranges due to sampling error. As seen from the 
table, they would be much larger for states with the same sampling errors, but with 
overpayment error rates closer to the 3 percent target, and of course would be 
considerably smaller for states with overpayment error rates considerably above the 
illustrated rate of 5 percent. 
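
The arithmetic in the paragraph above can be reproduced with a small sketch. It assumes that the coefficient of variation of the estimated disallowance is simply the standard error of R divided by the excess R - R_0, which is consistent with the figures quoted (.006 / .02 = .30) but is stated here as an assumption; the function name is hypothetical.

```python
def disallowance_cv(error_rate, target_rate, se):
    """Coefficient of variation of the estimated disallowance.

    Assumes the Federal payment is a known constant, so the relative error
    of the disallowance equals se / (error_rate - target_rate).
    """
    return se / (error_rate - target_rate)


# Larger state from the text: error rate 5 percent, target 3 percent, se .006.
cv_large = disallowance_cv(0.05, 0.03, 0.006)
# Smaller state with twice the sampling error.
cv_small = disallowance_cv(0.05, 0.03, 0.012)
# Same sampling error as the larger state, but an error rate closer to target.
cv_near_target = disallowance_cv(0.04, 0.03, 0.006)

for label, cv in (("large state", cv_large),
                  ("small state", cv_small),
                  ("near target", cv_near_target)):
    # Nominal 95 percent bounds average about two CVs either side.
    print(f"{label}: CV = {cv:.2f}, bounds about +/- {200 * cv:.0f} percent")
```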

From the point of view of the states, the problem of the large 
overestimates of disallowances that will occur in some years would be avoided by 
use of the lower confidence bound (i.e., Rule C) instead of the point estimate. 
However, as illustrated in column 12 of Table 3-1, and also in Table 3-4, with present 
annual sample sizes this would result in large losses to the Federal government by 
consistently and substantially understating the disallowances that would be assessed 
if the true payment error rates were known. 

Another problem with Rule A is that disallowances are assessed only 
when the estimated error rate is above the target. Thus, because of sampling 
variation, a state may be assessed a disallowance when in fact the payment error rate 
is equal to or below the target rate. Moreover, since negative disallowances are not 
permitted by Rule A, such disallowances would not be compensated for over time. 
Consequently, a state that is at or near the target rate, above or below, would on the 
average be improperly assessed disallowances. A state whose error rate is 



3-21 



Table of Contents 



Chapter 3. ContideTationt in Choice of Lower Confidenct Bound Vtmu Point Estimmtt in DettrmiHing Di$ullowanct* 



moderately above the target rate would, on the average, be assessed a considerably 
larger disallowance than it would be if the true error rate were known. 

To eliminate or substantially reduce these problems, we have 
developed and have simulated the application of Rule D for computing 
disallowances. This rule accumulates the full disallowances across years, computed 
by Rule A except that negative total disallowances are allowed to accumulate on the 
books. It assesses an annual cash disallowance on the basis of a lower confidence 
bound of the accumulated total disallowance. The final accumulated settlement is 
based on the accumulated disallowance based on the point estimates and is made 
when the relative sampling error (the coefficient of variation) of the accumulated 
total disallowance is acceptably small, say less than 10 to 15 percent. What is 
acceptably small is a policy decision. 

Convenient computation formulas are given in Appendix H. Over a 
few years, the application of Rule D greatly increases the effective sample size and 
greatly reduces the large annual fluctuations of disallowances due to sampling 
errors. Prior to a final settlement date, at which time the accumulated disallowance 
is based on the annual point estimates and a much larger sample, the Federal 
government recovers somewhat less in cash but avoids considerably overassessing 
some states each year. 
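
The mechanics of Rule D can be sketched as follows. This is only an illustration under stated assumptions, not the computation formulas of Appendix H: it assumes that the annual samples are independent (so that the variances of the annual disallowance estimates add), that the 1.645 multiplier of Rule C is used for the lower confidence bound, that a negative lower bound yields zero accumulated cash, and that settlement is triggered when the coefficient of variation falls below a threshold chosen here as .10. The function name, record fields, and inputs are all hypothetical.

```python
import math

Z_95_ONE_SIDED = 1.645  # assumed multiplier, as in Rule C


def rule_d(annual_results, cv_threshold=0.10):
    """Accumulate disallowances across years.

    annual_results: list of (estimated_excess, se, payment) tuples, one per
    year, where estimated_excess = R - R_0 may be negative.
    Returns one record per year with accumulated total, cash, book, and CV.
    """
    total = 0.0      # accumulated point-estimate disallowance (may be negative)
    variance = 0.0   # accumulated variance, assuming independent years
    records = []
    for year, (excess, se, payment) in enumerate(annual_results, start=1):
        total += excess * payment
        variance += (se * payment) ** 2
        std_err = math.sqrt(variance)
        # Accumulated cash is the lower confidence bound of the total, if positive.
        cash = max(0.0, total - Z_95_ONE_SIDED * std_err)
        cv = std_err / total if total > 0 else float("inf")
        records.append({
            "year": year,
            "total": total,
            "cash": cash,
            "book": total - cash,  # carried on the books until settlement
            "cv": cv,
            "settle": cv < cv_threshold,
        })
    return records


# Hypothetical state: estimated excess .05 every year, se .006, payment 100.
records = rule_d([(0.05, 0.006, 100.0)] * 4)
for rec in records:
    print(rec["year"], round(rec["total"], 1), round(rec["cash"], 1),
          round(rec["book"], 1), round(rec["cv"], 3), rec["settle"])
```

The cash assessed in a given year would be the change in accumulated cash from the prior year. With a constant estimated excess well above zero, the accumulated cash is a growing share of the accumulated total, because the standard error grows only with the square root of the number of years.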

We note that under this procedure, the lower confidence bound of the 
accumulated disallowance estimate for a given year, say year i, may be less than the 
lower confidence bound of the accumulated disallowance in the prior year, i-1. In 
this event, the Federal government could pay the difference to the state. The total 
accumulated disallowance would then remain the accumulation of the annual 
disallowances. Alternatively, credit could be given against future disallowances. 
The choice is a policy decision. 

We note, also, that when the excess of the true error rate over the 
tolerance becomes small, say, less than one percent, the coefficient of variation of 
the accumulated disallowance remains large (above 10 or 15 percent) for many years, 
and a settlement would be long delayed. This is as it should be, because the amount 
of settlement in such an instance cannot be estimated acceptably from a sample of 
any reasonable size, even after the sample is accumulated over a 
number of years. We also note, as will be seen later, that under Rule D, for states for 
which the sample is large and the excess of the overpayment error rate over the 
tolerance is also large, a cash settlement may be reached within two or three years or 
even annually. 

While the application of Rule D will result in considerable reduction 
initially in the cash withholding by the Federal government, a temporary cash loss 
may be acceptable for a few years in order to avoid substantially overassessing some 
states in individual years. Interest charges (or payments) might be introduced for 
the amounts carried on the books, in which event the disadvantage to the Federal 
government would appear to be reduced or removed. On a relative basis, the 
accumulated disallowance based on the lower confidence bound would approach, 
over a number of years, the full disallowance based on the point estimate. 

Table 3-5 illustrates the expected (average) consequences of applying 
Rule D to a state with an annual sampling error of .006, and also of .012, for a fixed 
annual Federal payment of $100 million. It shows, for varying levels of the true 
error rate, the expected accumulated disallowances over a period of 1 to 16 years, 
based on Rule D, compared with those for Rules A and C. Appendix H describes the 
application of Rule D more fully, and it contains 16 illustrative examples of 
disallowances computed by Rules D and A, for successive years. The tables display 
random variations as they may occur in practice, for various values of the true 
overpayment error rate, and of the standard error of the estimates. 

It is seen from Table 3-5, and from Appendix H, that Rule D provides a 
compromise approach between Rule C and Rule A. In the first year, with Rule D, 
the cash disallowances are the same as for Rule C, although the balance of the full 
Rule A disallowance is recorded as an obligation available for offset in subsequent 
years. 

While the accumulations are carried through 16 years in Table 3-5, they 
could be cut off after the estimated coefficient of variation becomes acceptably small 
and the accumulation process would begin again. The accumulated settlement 
would then be based on the accumulated results of the annual point estimates, and 
on a sample several times larger than the sample for a single year. The cut-off time 
would be extended more or less indefinitely for states with overpayment error rates 
near the target. Various modifications of Rule D could also be considered. 

Table 3-5. Expected accumulated disallowance comparison for Rules A, C, and D 

                                      Accumulated measures
            -----------------------------------------------------------------------------
               Federal Standard  Correct         Expected disallowance
               payment    error disallow-                        Rule D                CV
R-R_0   Year ($1 mil.)     of R     ance  Rule A  Rule C    Cash    Book   Total of total
-----  -----  --------  -------  -------  ------  ------  ------  ------  ------  -------
  .05      1       100    .0060      5.0     5.0     4.0     4.0     1.0     5.0     .120
           2       200    .0042     10.0    10.0     8.0     8.6     1.4    10.0     .085
           4       400    .0030     20.0    20.0    16.1    18.0     2.0    20.0     .060
           8       800    .0021     40.0    40.0    32.1    37.2     2.8    40.0     .042
          12     1,200    .0017     60.0    60.0    48.2    56.6     3.4    60.0     .035
          16     1,600    .0015     80.0    80.0    64.2    76.1     3.9    80.0     .030

  .05      1       100    .0120      5.0     5.0     3.0     3.0     2.0     5.0     .240
           2       200    .0085     10.0    10.0     6.1     7.2     2.8    10.0     .170
           4       400    .0060     20.0    20.0    12.1    16.1     3.9    20.0     .120
           8       800    .0042     40.0    40.0    24.2    34.4     5.6    40.0     .085
          12     1,200    .0035     60.0    60.0    36.3    53.2     6.8    60.0     .069
          16     1,600    .0030     80.0    80.0    48.4    72.1     7.9    80.0     .060

  .03      1       100    .0060      3.0     3.0     2.0     2.0     1.0     3.0     .200
           2       200    .0042      6.0     6.0     4.0     4.6     1.4     6.0     .141
           4       400    .0030     12.0    12.0     8.1    10.0     2.0    12.0     .100
           8       800    .0021     24.0    24.0    16.1    21.2     2.8    24.0     .071
          12     1,200    .0017     36.0    36.0    24.2    32.6     3.4    36.0     .058
          16     1,600    .0015     48.0    48.0    32.2    44.1     3.9    48.0     .050

  .03      1       100    .0120      3.0     3.0     1.1     1.1     1.9     3.0     .397
           2       200    .0085      6.0     6.0     2.2     3.2     2.8     6.0     .283
           4       400    .0060     12.0    12.0     4.5     8.1     3.9    12.0     .200
           8       800    .0042     24.0    24.0     9.0    18.4     5.6    24.0     .141
          12     1,200    .0035     36.0    36.0    13.5    29.2     6.8    36.0     .115
          16     1,600    .0030     48.0    48.0    18.0    40.1     7.9    48.0     .100

  .01      1       100    .0060      1.0     1.0     0.2     0.2     0.8     1.0     .568
           2       200    .0042      2.0     2.0     0.4     0.7     1.3     2.0     .420
           4       400    .0030      4.0     4.0     0.9     2.0     2.0     4.0     .300
           8       800    .0021      8.0     8.1     1.7     5.2     2.8     8.0     .212
          12     1,200    .0017     12.0    12.1     2.6     8.6     3.4    12.0     .173
          16     1,600    .0015     16.0    16.2     3.5    12.1     3.9    16.0     .150

  .01      1       100    .0120      1.0     1.1     0.1     0.1     1.0     1.1     .878
           2       200    .0085      2.0     2.3     0.2     0.3     2.0     2.3     .652
           4       400    .0060      4.0     4.6     0.4     0.9     3.3     4.2     .536
           8       800    .0042      8.0     9.1     0.9     2.8     5.2     8.0     .420
          12     1,200    .0035     12.0    13.7     1.3     5.3     6.7    12.0     .346
          16     1,600    .0030     16.0    18.2     1.7     8.1     7.9    16.0     .300

An important consequence of applying Rule D is that, prior to final 
settlement, the accumulated cash disallowance and thus the cash disallowance 
assessed in each individual year is determined from a confidence interval computed 
from the much larger accumulated QC sample. At the time of final assessment of 
the full disallowances the samples are much larger than the annual samples. Such 
an approach substantially reduces the wide variability in annual disallowances that 
occurs due to sampling variability under present procedures, especially for states 
with error rates close to the target or with small samples. This wide variability is 
illustrated, in detail, in the column headed "AFDC" of Tables H-1 through H-16 in 
Appendix H, giving the annual cash disallowance that would be assessed under the 
present rule. (Note that negative values in this column would, under present rules, 
result in a zero disallowance.) 

Another consequence of Rule D is that it allows only a very low 
probability of assessing any cash disallowances against a state that is, in fact, meeting 
or near (above or below) the target payment error rate but which would often be 
assessed disallowances under the present procedure, due to sampling variability. 

We note that in the application of Rule D, there may be an unusually 
large Federal withholding in the year of a final settlement. If desired, this 
adjustment to the point estimate could be spread over two or three years to make a 
smoother series of disallowances. 

A question that arises is how to treat waivers in the application of 
Rule D. Waivers occur when, for various reasons, all or a part of the disallowance 
that would otherwise be assessed against a state for a particular year is waived. In 
Table 3-4 above, full waivers for 1981 were granted for six states. No specific 
question arose because all waivers were full waivers. With Rule D, as with the 
other procedures, the disallowance after a full waiver would be zero. The added 
accumulation for that year would then be zero. For a partial waiver, the nonwaived 
part of the disallowance would be accumulated. The computation of the estimated 
standard error would reflect appropriately whatever waiver was allowed. 

In Table 3-9, we illustrate computation of disallowances for each state 
by Rule D for the four fiscal years 1981 through 1984, the years for which 
information is currently available. Since waivers are available only for 1981, we 
have made the computations without waivers. 

We note that, because of some exceedingly high target rates for some 
states for 1981 (and to some extent for 1982, also) the results presented in Table 3-9 
provide a quite distorted picture from the application of Rule D. For example, 
Illinois has a target rate for 1981 of 12.7 percent. Its observed rate of 8.3 percent is still 
a high error rate. If Rule D were to be applied to Illinois beginning in 1981, the state
would receive an initial book credit of 17.5 million dollars, to be credited against
future disallowances. It seems highly undesirable to initiate Rule D for such a state,
and more appropriate to initiate the rule for a state with a negative disallowance
only if the target for the state is below a specified level, for example, below 8 percent.
Of course, the setting of this specific target level is a policy determination. If the
specified target level for 1981 were set at 8 percent, then, of the 17 states with 1981
target rates over 8 percent, only one (Maryland) has a 1981 observed overpayment
rate above its target rate.
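
The Illinois figure can be checked with a short calculation. The sketch below assumes, consistently with the amounts shown in Table 3-9, that the annual computed disallowance is the Federal contribution multiplied by the difference between the observed rate and the target rate. The function name and the 1981 inputs (Federal contribution of $390,914,682, observed rate 0.08254, target rate 0.127286, read from the row of Table 3-9 that matches the Illinois rates quoted above) are illustrative assumptions rather than part of the procedures described in this report.

```python
def annual_disallowance(fed_contribution, observed_rate, target_rate):
    """Computed disallowance; negative when the observed rate is below target."""
    return fed_contribution * (observed_rate - target_rate)


# Assumed 1981 inputs for Illinois, read from Table 3-9.
illinois_1981 = annual_disallowance(390_914_682, 0.08254, 0.127286)

# A negative value is a book credit; here it is about -17.5 million dollars.
print(f"{illinois_1981:,.0f}")
```

A negative result of this size is what the text describes as an initial book credit to be applied against future disallowances.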

In Table 3-6, we provide a summary of the aggregate results from the 
application of Rule D for two levels of the allowable 1981 target rate (8 percent and 
10 percent) for the initiation of Rule D, assuming that the application of Rule D 
begins in 1981. Excluded from these respective summaries are the 16 states with 
1981 target levels above 8 percent for which the computed disallowances are 
negative, and the 6 states with 1981 target levels above 10 percent for which the 
computed disallowances are negative (see Table 3-9). 






Table 3-6. Summary of aggregate disallowances from application of Rule D to eligible states,*
1981-1984 (thousands of dollars)

                          Annual (Rule A)             Accumulated (Rule D)
                        Annual    Cumulated        Cash        Book       Total
                         (000)        (000)       (000)       (000)       (000)

Allowable target rate in 1981
is 8 percent or less
   Total 1981           70,837       70,837      28,901      34,542      63,443
         1982           88,137      158,974      81,422      63,576     144,999
         1983          119,836      278,810     179,908      79,407     259,315
         1984          158,750      437,560     313,796     102,723     416,518

Allowable target rate in 1981
is 10 percent or less
   Total 1981           70,837       70,837      28,901      18,941      47,842
         1982           88,518      159,355      81,422      41,268     122,691
         1983          124,755      284,110     179,908      60,421     240,329
         1984          173,591      457,701     320,346      91,092     411,938

*Eligible states are those that have 1981 target overpayment rates that are less than the allowable target, or that
exceed the allowable target but have a positive disallowance for 1981.
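
The eligibility rule in the footnote to Table 3-6 can be written as a simple test. The sketch below is illustrative only: the function name is hypothetical, the comparison is taken as "at or below" the allowable target (the table headings say "8 percent or less", while the footnote says "less than"), and the 1981 target rates and computed disallowances are assumed values read from Table 3-9.

```python
def is_eligible(target_1981, disallowance_1981, allowable_target):
    """Footnote rule: target at or below the allowable level,
    or a positive computed disallowance for 1981."""
    return target_1981 <= allowable_target or disallowance_1981 > 0


# Assumed 1981 values: (target rate, computed disallowance in dollars).
states = {
    "California": (0.040000, 35_072_894),
    "Maryland": (0.103815, 1_325_512),
    "Illinois": (0.127286, -17_491_868),
    "District of Columbia": (0.162980, -1_212_876),
}

for allowable in (0.08, 0.10):
    eligible = [name for name, (target, disallowance) in states.items()
                if is_eligible(target, disallowance, allowable)]
    print(allowable, eligible)
```

Under either allowable level, Maryland stays in the summary because its 1981 disallowance is positive, while Illinois and the District of Columbia are excluded.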






Table 3-7 provides a summary of the additional disallowances that 
would be assessed for those states that would reach a full settlement some time 
during the four-year period for which data are available if an estimated 15 percent 
coefficient of variation were the criterion for settlement on the basis of the point 
estimate. Table 3-8 gives similar results if the criterion for a full settlement were an 
estimated coefficient of variation of 10 percent. 
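
The settlement criterion can be illustrated with one state. The sketch below assumes that the estimated cv is the accumulated standard error, sigma(D), divided by the accumulated total disallowance, that the cv is rounded to two decimals before it is compared with the criterion, and that only a positive accumulated disallowance is settled. These assumptions reproduce the cv values listed in Tables 3-7 and 3-9 but are not stated as rules in the text; the function names and the Arizona 1983 inputs (read from Table 3-9) are illustrative.

```python
def coefficient_of_variation(sigma_d, accumulated_total):
    """Estimated cv of the accumulated disallowance."""
    return sigma_d / abs(accumulated_total)


def reaches_full_settlement(sigma_d, accumulated_total, cv_criterion):
    # Assumption: only a positive accumulated disallowance is settled,
    # and the cv is compared after rounding to two decimals.
    cv = round(coefficient_of_variation(sigma_d, accumulated_total), 2)
    return accumulated_total > 0 and cv <= cv_criterion


# Assumed Arizona 1983 accumulated values, read from Table 3-9.
sigma_d, total = 568_165, 3_996_254
cv = coefficient_of_variation(sigma_d, total)

print(round(cv, 2))
print(reaches_full_settlement(sigma_d, total, 0.15))
print(reaches_full_settlement(sigma_d, total, 0.10))
```

With these inputs Arizona qualifies under the 15 percent criterion but not under the 10 percent criterion, in line with its appearance in Table 3-7 and its absence from Table 3-8.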

The District of Columbia is not included in the summaries provided in
Tables 3-7 and 3-8 because its target rate was 16.3 percent in 1981 with a negative
computed disallowance. For D.C., Rule D would have been initiated in 1982 because
the disallowance was then positive, and presumably a complete settlement would
have been made for D.C. for each of the years 1982, 1983, and 1984 since its cv in each
of these years was less than 10 percent. The total settlement for the three years
would have been $9,743 thousand.
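
The D.C. total agrees with the sum of its annual computed disallowances for 1982 through 1984. The check below is a sketch; the three annual amounts are assumed values read from the D.C. rows of Table 3-9.

```python
# Assumed annual computed disallowances for D.C. (dollars), from Table 3-9.
dc_annual = {1982: 3_013_882, 1983: 3_663_344, 1984: 3_065_760}

# Three-year settlement, expressed in thousands of dollars.
total_thousands = round(sum(dc_annual.values()) / 1000)
print(total_thousands)
```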



Table 3-7. States reaching full settlement by or before 1984, if Rule D were initiated in 1981, and if a
15 percent estimated cv were adopted as the criterion

                 Full settlement at                  Added settlement
                 end of fiscal year        Amount    Percent of Federal payment
State              Year       cv           ($000)    This year     Cumulative

Arizona            1983      .14              935        2.4           1.2
Colorado           1984      .15            1,207        2.3           0.6
Florida            1984      .15            2,364        1.6           0.5
Michigan           1983      .15            9,961        1.8           0.6
(Mich.)            1984      .12            1,658        0.3           0.3
New Mexico         1982      .13              935        3.0           1.5
New York           1983      .15           18,177        2.1           0.7
S. Carolina        1983      .14            1,107        2.1           0.7
(S.C.)             1984      .11              100        0.2           0.2
Wyoming            1981      .13               88        2.1           2.1

Total                                      36,532








Table 3-8. States reaching full settlement by or before 1984 if Rule D were initiated in 1981, and if a
10 percent estimated cv were adopted as the criterion

                 Full settlement at                  Added settlement
                 end of fiscal year        Amount    Percent of Federal payment
State              Year       cv           ($000)    This year     Cumulative

Michigan           1984      .10           11,619        1.9           0.5
S. Carolina        1984      .10            1,207        2.2           0.6

Total                                      12,826



In summary, assuming the 8 percent 1981 target level, the total cash 
disallowance would be: 



                                                          Amount     Percent
                                                          ($000)     of total

Accumulated total cash, 1981 through 1984,
from Table 3-6                                          $313,796       73.6

Add cash from 10 complete settlements (Table 3-7)         36,532        8.6

Add cash from complete settlements for D.C.
(not included in Table 3-6)                                9,743        2.3

Total cash disallowances assessed over the
four years                                               360,071       84.5

Total accumulated on the book at the end of the
four years (102,723 from Table 3-6, less the
additional 36,532 from complete cash settlements)         66,191       15.5

Total accumulated disallowances in four years,
cash plus book                                           426,262      100.0
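
The arithmetic behind this summary can be reproduced directly from the amounts quoted above (all in thousands of dollars). The variable names in the sketch below are illustrative.

```python
# Amounts in thousands of dollars, as quoted in the summary above.
cash_table_3_6 = 313_796      # accumulated cash, 1981-1984 (Table 3-6)
settlements_3_7 = 36_532      # cash from complete settlements (Table 3-7)
dc_settlements = 9_743        # complete settlements for D.C.
book_table_3_6 = 102_723      # accumulated book, end of 1984 (Table 3-6)

total_cash = cash_table_3_6 + settlements_3_7 + dc_settlements
book_remaining = book_table_3_6 - settlements_3_7
grand_total = total_cash + book_remaining

# Each component as a percent of the four-year total.
for label, amount in [("cash, Table 3-6", cash_table_3_6),
                      ("settlements, Table 3-7", settlements_3_7),
                      ("D.C. settlements", dc_settlements),
                      ("total cash", total_cash),
                      ("book", book_remaining)]:
    print(f"{label:24s}{amount:>10,d}{100 * amount / grand_total:8.1f}")
print(f"{'cash plus book':24s}{grand_total:>10,d}{100.0:8.1f}")
```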



Due to possible minor differences from rounding, and especially
because waivers are not available and were not used in the results presented, and perhaps
because of other factors, Tables 3-6 through 3-9 may differ somewhat from the final
determinations if Rule D were to be applied. Nevertheless, they provide satisfactory
illustrations of the kinds of results that would occur from applying Rule D.



3.8 Summary 

A primary purpose of the quality control program in AFDC is to 
measure the error rates and to identify likely causes of high rates, in order to guide 
corrective action. Another major purpose is the assessment of disallowances, based 
on QC estimates of overpayment error rates, in order to recover Federal funds that 
have been paid because of overpayment errors above target levels, as prescribed by 
law. The assessment of disallowances may also be an important factor in 
influencing states to improve their administration and procedures, and thus to 
reduce their error rates. The disallowances are currently computed annually using
point estimates. A number of states have presented arguments for the use of lower
confidence bounds in the assessment of disallowances because of the impact of
sampling errors on the assessments. The statistical consequences of using the lower
confidence bound versus the point estimate have been examined, and some
alternative procedures for computing disallowances have been described. They make
use of the point estimate, the lower confidence bound, or both, and one procedure 
accumulates the computations of disallowances over time in order to reduce the 
effect on the annual disallowance of large sampling errors. The statistical 
implications of the four alternatives have been examined in detail and illustrated 
with examples. 











Table 3-9. Application of Rule D to states

                   Annual statistics                 Rule A disallowance*     Rule D annual values                              Rule D accumulated values
St Year    Fed contrib   Target    R-hat     s.e.       Annual    Cumulated        Cash         Book    Fed contrib   R-hat    sigma(D)        Cash         Book        Total     cv
                                                              (if positive)

AK 1981     17,163,771  .221241  0.18189  0.02625     -675,412                              -675,412     17,163,771  0.1819     450,549                 -675,412     -675,412   0.67
   1982     16,140,020  .130621  0.12086  0.01871     -157,543                              -157,543     33,303,791  0.1523     542,389                 -832,954     -832,954   0.65
   1983     15,019,616  .040000  0.15496  0.02126    1,726,655    1,726,655                1,726,655     48,323,407  0.1531     629,404                  893,701      893,701   0.70
   1984     18,670,392  .030000  0.06827  0.01388      714,516    2,441,171     488,522      225,994     66,993,799  0.1295     680,666     488,522    1,119,695    1,608,217   0.42
AL 1981     55,257,339  .076399  0.07724  0.00816       46,471       46,471                   46,471     55,257,339  0.0772     450,900                   46,471       46,471  >1.00
   1982     51,190,010  .058200  0.05293  0.00682     -269,771       46,471                 -269,771    106,447,349  0.0655     570,257                 -223,300     -223,300  >1.00
   1983     52,044,121  .040000  0.03158  0.00475     -438,211       46,471                 -438,211    158,491,470  0.0544     621,535                 -661,511     -661,511   0.94
   1984     52,634,574  .030000  0.04363  0.00841      717,409      763,881                  717,409    211,126,044  0.0517     763,053                   55,898       55,898  >1.00
AR 1981     37,208,159  .074268  0.06788  0.00647     -237,686                              -237,686     37,208,159  0.0679     240,737                 -237,686     -237,686  >1.00
   1982     24,586,499  .057134  0.07027  0.00800      322,969      322,969                  322,969     61,794,658  0.0688     310,873                   85,283       85,283  >1.00
   1983     24,866,313  .040000  0.04856  0.00721      212,856      535,824                  212,856     86,660,971  0.0630     358,867                  298,138      298,138  >1.00
   1984     28,755,157  .030000  0.03802  0.00683      230,616      766,440                  230,616    115,416,128  0.0568     410,482                  528,755      528,755   0.78
AZ 1981     18,204,168  .066681  0.08278  0.00973      293,069      293,069       1,696      291,373     18,204,168  0.0828     177,127       1,696      291,373      293,069   0.60
   1982     21,336,453  .053341  0.11603  0.01054    1,337,561    1,630,630   1,158,028      179,533     39,540,621  0.1007     286,265   1,159,724      470,906    1,630,630   0.18
   1983     39,230,909  .040000  0.10030  0.01251    2,365,624    3,996,254   1,901,898      463,725     78,771,530  0.1005     568,165   3,061,622      934,631    3,996,254   0.14
   1984     42,758,808  .030000  0.09656  0.01174    2,846,026    6,842,280   2,533,487      312,539    121,530,338  0.0991     758,158   5,595,110    1,247,170    6,842,280   0.11
CA 1981  1,270,296,772  .040000  0.06761  0.00843   35,072,894   35,072,894  17,457,244   17,615,650  1,270,296,772  0.0676  10,708,602  17,457,244   17,615,650   35,072,894   0.31
   1982  1,366,989,822  .040000  0.06001  0.00790   27,353,466   62,426,360  19,951,197    7,402,269  2,637,286,594  0.0637  15,208,461  37,408,441   25,017,919   62,426,360   0.24
   1983  1,493,164,856  .040000  0.04806  0.00560   12,034,909   74,461,269   8,502,920    3,531,989  4,130,451,450  0.0580  17,355,567  45,911,361   28,549,908   74,461,269   0.23
   1984  1,586,346,359  .030000  0.05177  0.00796   34,534,760  108,996,029  27,777,862    6,756,899  5,716,797,809  0.0563  21,463,104  73,689,223   35,306,807  108,996,028   0.20
CO 1981     47,081,958  .042135  0.08245  0.01024    1,898,109    1,898,109   1,105,023      793,086     47,081,958  0.0825     482,119   1,105,023      793,086    1,898,109   0.25
   1982     45,283,369  .041067  0.06603  0.00697    1,130,409    3,028,518     975,572      154,837     92,365,327  0.0744     576,245   2,080,595      947,923    3,028,518   0.18
   1983     51,766,123  .040000  0.06223  0.00673    1,150,761    4,179,278     990,985      159,775    144,131,450  0.0700     673,373   3,071,580    1,107,698    4,179,278   0.16
   1984     53,629,580  .030000  0.04618  0.00544      867,727    5,047,005     768,230       99,496    197,761,030  0.0636     733,857   3,839,811    1,207,195    5,047,005   0.15
CT 1981    102,601,922  .070950  0.07400  0.00401      312,936      312,936                  312,936    102,601,922  0.0740     411,434                  312,936      312,936  >1.00
   1982    105,097,773  .055475  0.06360  0.00902      853,919    1,166,855                  853,919    207,699,695  0.0687   1,033,415                1,166,855    1,166,855   0.89
   1983    108,706,080  .040000  0.04401  0.00422      435,911    1,602,767                  435,911    316,405,775  0.0602   1,130,659                1,602,767    1,602,767   0.71
   1984    111,939,469  .030000  0.03393  0.00456      439,922    2,042,688       1,998      437,924    428,345,240  0.0534   1,240,541       1,998    2,040,690    2,042,688   0.61
DC 1981     44,362,691  .162980  0.13564  0.00946   -1,212,876                            -1,212,876     44,362,691  0.1356     419,671               -1,212,876   -1,212,876   0.35
   1982     43,215,977  .101490  0.17123  0.01282    3,013,882    3,013,882     657,676    2,356,206     87,578,668  0.1532     695,034     657,676    1,143,330    1,801,006   0.39
   1983     40,036,548  .040000  0.13150  0.01316    3,663,344    6,677,226   3,371,960      291,384    127,615,217  0.1464     872,167   4,029,636    1,434,715    5,464,350   0.16
   1984     37,300,887  .030000  0.11219  0.01038    3,065,760    9,742,986   2,930,739      135,021    164,916,104  0.1387     954,246   6,960,375    1,569,735    8,530,110   0.11
DE 1981     16,034,496  .120495  0.11276  0.01705     -124,027                              -124,027     16,034,496  0.1128     273,388                 -124,027     -124,027  >1.00
   1982     14,158,437  .080248  0.11875  0.02287      545,128      545,128                  545,128     30,192,933  0.1156     423,780                  421,101      421,101  >1.00
   1983     13,617,760  .040000  0.09371  0.01596      731,410    1,276,538     369,059      362,351     43,810,693  0.1088     476,263     369,059      783,452    1,152,511   0.41
   1984     13,785,238  .030000  0.07791  0.01637      660,451    1,936,988     576,954       83,496     57,595,931  0.1014     527,020     946,013      866,949    1,812,962   0.29
FL 1981    121,842,954  .050798  0.07925  0.00528    3,466,676    3,466,676   2,408,397    1,058,279    121,842,954  0.0793     643,331   2,408,397    1,058,279    3,466,676   0.19
   1982    119,632,382  .045399  0.06030  0.00543    1,782,642    5,249,318   1,336,974      445,668    241,475,336  0.0699     914,254   3,745,371    1,503,947    5,249,318   0.17



Table 3-9. Application of Rule D to states (continued)

                   Annual statistics                 Rule A disallowance*     Rule D annual values                              Rule D accumulated values
St Year    Fed contrib   Target    R-hat     s.e.       Annual    Cumulated        Cash         Book    Fed contrib   R-hat    sigma(D)        Cash         Book        Total     cv
                                                              (if positive)

FL 1983    138,762,474  .040000  0.04523  0.00353      725,728    5,975,046     523,472      202,255    380,237,810  0.0609   1,037,205   4,268,843    1,706,203    5,975,046   0.17
   1984    144,962,663  .030000  0.05354  0.00686    3,412,421    9,387,467   2,754,905      657,516    525,200,473  0.0588   1,436,911   7,023,748    2,363,719    9,387,467   0.15
GA 1981    105,505,310  .065285  0.06517  0.00497      -12,133                               -12,133    105,505,310  0.0652     524,361                  -12,133      -12,133  >1.00
   1982    113,975,476  .052642  0.05143  0.00746     -138,138                              -138,138    219,480,786  0.0580     998,945                 -150,271     -150,271  >1.00
   1983    125,472,468  .040000  0.05725  0.00488    2,164,400    2,164,400      86,732    2,077,668    344,953,254  0.0577   1,171,670      86,732    1,927,397    2,014,129   0.58
   1984    133,858,211  .030000  0.06177  0.00614    4,252,675    6,417,075   3,825,760      426,915    478,811,465  0.0589   1,431,193   3,912,492    2,354,312    6,266,804   0.23
HI 1981     46,619,415  .074989  0.10098  0.01210    1,211,685    1,211,685     283,749      927,936     46,619,415  0.1010     564,095     283,749      927,936    1,211,685   0.47
   1982     43,937,529  .057495  0.08217  0.01418    1,084,159    2,295,844     629,536      454,622     90,556,944  0.0919     840,461     913,285    1,382,558    2,295,844   0.37
   1983     43,207,080  .040000  0.06901  0.01258    1,253,437    3,549,281     989,503      263,935    133,764,024  0.0845   1,000,908   1,902,788    1,646,493    3,549,281   0.28
   1984     41,402,140  .030000  0.06653  0.01231    1,512,420    5,061,701   1,311,255      201,165    175,166,164  0.0802   1,123,196   3,214,044    1,847,658    5,061,701   0.22
IA 1981     83,979,881  .065241  0.04260  0.00440   -1,901,388                            -1,901,388     83,979,881  0.0426     369,511               -1,901,388   -1,901,388   0.19
   1982     70,266,429  .052620  0.04492  0.00531     -541,052                              -541,052    154,246,310  0.0437     525,122               -2,442,440   -2,442,440   0.21
   1983     80,125,076  .040000  0.03430  0.00550     -456,713                              -456,713    234,371,386  0.0405     685,536               -2,899,153   -2,899,153   0.24
   1984     87,801,937  .030000  0.03662  0.00472      581,249      581,249                  581,249    322,173,323  0.0394     801,066               -2,317,904   -2,317,904   0.35
ID 1981     14,481,785  .042925  0.09065  0.01878      691,143      691,143     243,756      447,387     14,481,785  0.0907     271,968     243,756      447,387      691,143   0.39
   1982     13,153,079  .041462  0.05430  0.01167      168,859      860,002     102,522       66,337     27,634,864  0.0733     312,294     346,278      513,724      860,002   0.36
   1983     14,024,312  .040000  0.02977  0.00975     -143,469      860,002    -190,554       47,085     41,659,176  0.0587     340,917     155,724      560,809      716,534   0.48
   1984     13,849,779  .030000  0.09495  0.01924      899,543    1,759,546     748,558      150,985     55,508,955  0.0677     432,702     904,282      711,794    1,616,077   0.27
IL 1981    390,914,682  .127286  0.08254  0.00703  -17,491,868                           -17,491,868    390,914,682  0.0825   2,748,130              -17,491,868  -17,491,868   0.16
   1982    401,104,833  .082430  0.08243  0.00907                                                       792,019,515  0.0825   4,559,322              -17,491,868  -17,491,868   0.26
   1983    411,830,534  .040000  0.06816  0.00782   11,597,148   11,597,148               11,597,148  1,203,850,049  0.0776   5,582,036               -5,894,721   -5,894,721   0.95
   1984    421,853,209  .030000  0.06497  0.00667   14,752,207   26,349,355               14,752,207  1,625,703,258  0.0743   6,251,110                8,857,486    8,857,486   0.71
IN 1981     83,266,989  .040000  0.04135  0.00525      112,410      112,410                  112,410     83,266,989  0.0414     437,152                  112,410      112,410  >1.00
   1982     78,402,824  .040000  0.03856  0.00486     -112,900      112,410                 -112,900    161,669,813  0.0400     579,906                     -490         -490  >1.00
   1983     82,652,384  .040000  0.04852  0.00474      704,198      816,609                  704,198    244,322,197  0.0429     699,841                  703,709      703,709   0.99
   1984     91,286,467  .030000  0.03963  0.00304      879,089    1,695,697     344,353      534,736    335,608,664  0.0420     752,854     344,353    1,238,445    1,582,797   0.48
KS 1981     47,251,492  .040903  0.08117  0.00937    1,902,676    1,902,676   1,174,358      728,318     47,251,492  0.0812     442,746   1,174,358      728,318    1,902,676   0.23
   1982     42,607,920  .040452  0.02813  0.00627     -525,015    1,902,676    -647,330      122,315     89,859,412  0.0560     517,102     527,028      850,633    1,377,661   0.38
   1983     47,801,194  .040000  0.05111  0.00825      531,071    2,433,747     311,930      219,141    137,660,606  0.0543     650,318     838,958    1,069,774    1,908,732   0.34
   1984     43,957,502  .030000  0.05489  0.00963    1,094,102    3,527,849     887,430      206,672    181,618,108  0.0545     775,955   1,726,389    1,276,446    3,002,835   0.26
KY 1981     99,638,877  .081195  0.04974  0.00431   -3,134,141                            -3,134,141     99,638,877  0.0497     429,444               -3,134,141   -3,134,141   0.14
   1982     83,326,419  .060597  0.03576  0.00421   -2,069,578                            -2,069,578    182,965,296  0.0434     554,514               -5,203,719   -5,203,719   0.11
   1983     86,117,601  .040000  0.03420  0.00396     -499,482                              -499,482    269,082,897  0.0404     650,987               -5,703,201   -5,703,201   0.11
   1984     95,131,099  .030000  0.04146  0.00409    1,090,202    1,090,202                1,090,202    364,213,996  0.0407     758,401               -4,612,999   -4,612,999   0.16
LA 1981     89,792,909  .087025  0.06705  0.00600   -1,793,613                            -1,793,613     89,792,909  0.0671     538,757               -1,793,613   -1,793,613   0.30
   1982     85,012,672  .063512  0.06163  0.00636     -159,994                              -159,994    174,805,581  0.0644     763,279               -1,953,607   -1,953,607   0.39
   1983     80,125,076  .040000  0.05675  0.00699    1,342,095    1,342,095                1,342,095    254,930,657  0.0620     846,720                 -611,512     -611,512  >1.00
   1984     93,291,207  .030000  0.05793  0.00597    2,605,344    3,947,439     186,973    2,418,371    348,221,864  0.0609   1,098,394     186,973    1,806,859    1,993,831   0.55



9/19/86






Table 3-9. Application of Rule D to states (continued)

                   Annual statistics                 Rule A disallowance*     Rule D annual values                              Rule D accumulated values
St Year    Fed contrib   Target    R-hat     s.e.       Annual    Cumulated        Cash         Book    Fed contrib   R-hat    sigma(D)        Cash         Book        Total     cv
                                                              (if positive)

MA 1981    266,657,338  .119126  0.09260  0.00839   -7,073,353                            -7,073,353    266,657,338  0.0926   2,237,255               -7,073,353   -7,073,353   0.32
   1982    250,846,885  .079563  0.07362  0.00613   -1,490,783                            -1,490,783    517,504,223  0.0834   2,714,738               -8,564,136   -8,564,136   0.32
   1983    223,000,695  .040000  0.11434  0.01028   16,577,872   16,577,872   2,168,748   14,409,124    740,504,918  0.0927   3,553,184   2,168,748    5,844,988    8,013,736   0.44
   1984    203,367,441  .030000  0.07757  0.00741    9,674,189   26,252,061   9,170,239      503,950    943,872,359  0.0895   3,859,537  11,338,987    6,348,938   17,687,925   0.22
MD 1981    113,146,541  .103815  0.11553  0.00741    1,325,512    1,325,512                1,325,512    113,146,541  0.1155     838,416                1,325,512    1,325,512   0.63
   1982    106,521,840  .071907  0.08218  0.00778    1,094,299    2,419,811     480,558      613,741    219,668,381  0.0994   1,178,877     480,558    1,939,253    2,419,811   0.49
   1983    112,256,768  .040000  0.05270  0.00424    1,425,661    3,845,472   1,273,565      152,096    331,925,149  0.0836   1,271,337   1,754,123    2,091,349    3,845,472   0.33
   1984    114,551,324  .030000  0.05665  0.00600    3,052,793    6,898,264   2,766,739      286,054    446,476,473  0.0767   1,445,230   4,520,861    2,377,403    6,898,264   0.21
ME 1981     40,429,640  .074661  0.07881  0.00930      167,743      167,743                  167,743     40,429,640  0.0788     375,996                  167,743      167,743  >1.00
   1982     41,341,291  .057330  0.04095  0.00591     -677,170      167,743                 -677,170     81,770,931  0.0597     448,407                 -509,428     -509,428   0.88
   1983     44,763,143  .040000  0.04548  0.00776      245,302      413,045                  245,302    126,534,074  0.0546     567,211                 -264,126     -264,126  >1.00
   1984     48,837,175  .030000  0.04144  0.00950      558,697      971,742                  558,697    175,371,249  0.0510     732,790                  294,572      294,572  >1.00
MI 1981    549,635,657  .074685  0.07284  0.00755   -1,014,078                            -1,014,078    549,635,657  0.0728   4,149,749               -1,014,078   -1,014,078  >1.00
   1982    532,150,982  .057343  0.08235  0.00592   13,307,500   13,307,500   3,722,826    9,584,673  1,081,786,639  0.0775   5,210,088   3,722,826    8,570,595   12,293,422   0.42
   1983    566,088,345  .040000  0.09144  0.00545   29,119,584   42,427,084  27,729,659    1,389,926  1,647,874,984  0.0823   6,055,028  31,452,485    9,960,521   41,413,006   0.15
   1984    615,275,503  .030000  0.08011  0.00591   30,831,455   73,258,540  29,173,352    1,658,104  2,263,150,487  0.0817   7,062,994  60,625,836   11,618,625   72,244,462   0.10
MN 1981    134,920,297  .040000  0.04423  0.00795      570,713      570,713                  570,713    134,920,297  0.0442   1,072,616                  570,713      570,713  >1.00
   1982    127,746,141  .040000  0.03028  0.00709   -1,241,692      570,713               -1,241,692    262,666,438  0.0374   1,403,864                 -670,980     -670,980  >1.00
   1983    140,175,501  .040000  0.02567  0.00389   -2,008,715      570,713               -2,008,715    402,841,939  0.0333   1,506,044               -2,679,695   -2,679,695   0.56
   1984    151,030,687  .030000  0.02014  0.00314   -1,489,163      570,713               -1,489,163    553,872,626  0.0297   1,578,945               -4,168,857   -4,168,857   0.38
MO 1981    116,840,386  .080665  0.07085  0.00674   -1,146,788                            -1,146,788    116,840,386  0.0708     787,504               -1,146,788   -1,146,788   0.69
   1982    105,937,011  .060332  0.04772  0.00596   -1,336,078                            -1,336,078    222,777,386  0.0599   1,009,361               -2,482,866   -2,482,866   0.41
   1983    113,032,830  .040000  0.03431  0.00382     -643,157                              -643,157    335,810,235  0.0513   1,097,838               -3,126,023   -3,126,023   0.35
   1984    120,007,650  .030000  0.03709  0.00524      850,854      850,854                  850,854    455,817,785  0.0475   1,265,183               -2,275,169   -2,275,169   0.56
MS 1981     48,171,208  .090413  0.06909  0.00689   -1,027,155                            -1,027,155     48,171,208  0.0691     331,900               -1,027,155   -1,027,155   0.32
   1982     42,747,195  .065207  0.04738  0.00737     -762,019                              -762,019     90,918,403  0.0588     457,805               -1,789,173   -1,789,173   0.26
   1983     43,781,804  .040000  0.03491  0.00744     -222,848                              -222,848    134,688,207  0.0511     561,700               -2,012,023   -2,012,023   0.28
   1984     44,672,252  .030000  0.02027  0.00317     -434,661                              -434,661    178,370,458  0.0434     579,276               -2,446,684   -2,446,684   0.24
MT 1981     12,019,670  .078326  0.04923  0.01078     -349,724                              -349,724     12,019,670  0.0492     129,572                 -349,724     -349,724   0.37
   1982     12,363,665  .059163  0.02547  0.00913     -416,569                              -416,569     24,383,335  0.0372     171,845                 -766,293     -766,293   0.22
   1983     15,494,757  .040000  0.02456  0.00914     -239,239                              -239,239     39,878,092  0.0323     222,683               -1,005,532   -1,005,532   0.22
   1984     17,468,216  .030000  0.06910  0.01515      683,007      683,007                  683,007     57,346,308  0.0435     345,867                 -322,525     -322,525  >1.00
NC 1981    106,567,740  .066062  0.05420  0.00418   -1,264,107                            -1,264,107    106,567,740  0.0542     445,453               -1,264,107   -1,264,107   0.35
   1982     96,970,814  .053031  0.03281  0.00337   -1,960,847                            -1,960,847    203,538,554  0.0440     552,468               -3,224,953   -3,224,953   0.17
   1983    103,724,778  .040000  0.02684  0.00379   -1,365,018                            -1,365,018    307,263,332  0.0382     678,058               -4,589,971   -4,589,971   0.15
   1984    102,984,223  .030000  0.03485  0.00437      499,473      499,473                  499,473    410,247,555  0.0374     813,818               -4,090,498   -4,090,498   0.20
ND 1981      9,854,398  .040000  0.03089  0.00884      -89,774                               -89,774      9,854,398  0.0309      87,113                  -89,774      -89,774   0.97



9/19/86 






Table 3-9. Application of Rule D to states (continued) 





1 
EYmt 


Fad Conlrlb 


Annual Slallttics 
Targal R-hat 


i.a 


Rula A DIsallowancs* 
Annual Cumulalad 
(If poslllva) 








RULE 













STAT 


Annual Viduaa | 




Cwn 


Book 


FadConlrtb 


R-hal 


slgma(O) 


Cash 


Rnnk 


Tolal 


cv 




1982 


8,921,303 


.040000 


0.01909 


0.00547 


-186.544 








-186.544 


18,775,701 


0.0253 


99,850 





276,318 


-276.318 


36 




1B83 


9.248,940 


.040000 


0.02071 


000741 


■178.412 








-178,412 


28,024,841 


0.0238 


121.108 





■454.730 


454.730 


27 




1B84 


9,718.794 


030000 


0.04887 


0.01375 


163.956 


163,856 





163.856 


37.743.435 


0.0297 


180.347 





290,774 


-290,774 


062 


^€ 


1981 


27.006.307 


.044331 


0.05470 


0.01241 


280.028 


280,028 





280,026 


27.006.307 


0.0547 


335.148 





280,028 


280.028 


>1.00 




1982 


28.287.670 


.042166 


0.00594 


0.01656 


1.521.141 


1.801,170 


853.867 


667,474 


55.203.877 


0.0758 


575.990 


853,667 


947,503 


1.801.170 


032 




1983 


31.391.923 


.040000 


0.04679 


0.00804 


213,151 


2,014.321 


104.394 


106,757 


86,665.800 


00653 


642.103 


858,061 


1,056,260 


2,014,321 


0.32 




1984 


32,168,888 


.030000 


0.066M 


0.01252 


1,253.300 


3.267.621 


1.062.711 


180,580 


118,854,768 


0.0663 


757.963 


2,020,771 


1,246,849 


3,267.621 


023 


MH 


1881 


16,882,118 


.066656 


0.06569 


0.01241 


-350.574 








-350.574 


16,662.118 


0.0659 


209.507 





-350,574 


-350,574 


60 




1982 


14,571,771 


.069326 


0.056SS 


0.01130 


-69,624 








-68,624 


31,453,888 


0.0625 


266,470 





-420,198 


-420.198 


0.63 




1983 


14.073,658 


.040000 


0.04340 


0.00722 


47,650 


47.850 





47,850 


45.527,547 


0.0566 


285.167 





-372.348 


-372.348 


0.77 




1984 


12,883,437 


.030000 


0.07522 


0.01513 


582,568 


630,438 





582,588 


58.410,884 


0.0607 


345.438 





210.241 


210,241 


>1.00 


NJ 


1981 


270.51 5.844 


.075481 


0.06021 


0.00764 


1,270,269 


1,279,269 





1,279,268 


270.515.844 


0.0802 


2.066,741 





1,279,269 


1,279,269 


>1 00 




1982 


256,603,833 


.057740 


0.07341 


0.00747 


4,020,064 


5.300.253 


663.319 


3.357,665 


527.118.777 


0.0760 


2.818.805 


663.318 


4.636,934 


5,300,253 


053 




1983 


248,058,007 


.040000 


0.06364 


0.00576 


5,885.367 


11.165,620 


5,319.831 


565.536 


776,077.784 


0.0726 


3.162.586 


5,983,150 


5,202,470 


11.185,620 


0.28 




1984 


245,446,764 


.030000 


0.05 1M 


0.00640 


5,228,016 


16,413.636 


4,621.607 


606.409 


1.021.524.548 


0.0675 


3.531,234 


10,604,757 


5,808.879 


16,413.636 


022 


MM 


1981 


32,394,291 


.045037 


0.12386 


0.01413 


2,553,415 


2.553.415 


1.800.447 


752,968 


32.304.281 


0.1239 


457.731 


1.800,447 


752.968 


2.553,415 


0.18 




1982 


30,773,114 


.042510 


0.10524 


0.01095 


1,930,120 


4.483,536 


1.748.092 


182.029 


63.167.405 


0.1148 


568,387 


3.548,538 


934.997 


4,483,536 


0.13 




1983 


29.860.817 


.040000 


0.06025 


0.01031 


604.864 


5.088.388 


476.445 


128.419 


93.037.222 


0.0973 


646.453 


4,024,984 


1,063,416 


5,088.399 


0.13 


u 


1984 


34.686,013 


.030000 


0.05014 


0.00787 


1,010,750 


6.099.150 


919.827 


90.924 


127.723,235 


0.0869 


701.726 


4.944.811 


1,154.339 


6.099.150 


0.12 


" NV 


1881 


6.196,357 


.040000 


0.02260 


0.00674 


-107,817 








•107.817 


6,106,357 


0.0226 


41.763 





-107,817 


-107,817 


0.39 




1982 


6.023,800 


.040000 


0.01 255 


0.00590 


-165.353 








-165,353 


12.220,157 


0.01 76 


54.839 





-273,170 


•273.170 


0.20 




1983 


5,434,216 


.040000 


0.02601 


0.00880 


-71.134 








-71.134 


17,654,375 


0.0205 


72.761 





-344,304 


344.304 


0.21 




19B4 


5,064,327 


.030000 


0.02081 


0.00873 


-46.725 








-46,725 


22,738,702 


0.0206 


87.886 





-391,029 


-391.029 


0.23 


NY 


1981 


755.115,221 


.071713 


0.06002 


0.00626 


6.272,742 


6,272,742 





6.272,742 


755,115,221 


0.0800 


4.727.021 





6,272,742 


6,272,742 


75 




1982 


835,083,462 


.055656 


0.07056 


0.00706 


18,811,520 


26.084,262 


13,632.025 


6,170,485 


1,580.198.663 


0.0708 


7,569,749 


13.632.025 


12,452,237 


26,084.262 


0.29 




1983 


883,636,254 


.040000 


0.00361 


0.00811 


47,548,467 


73,632,729 


41.823.460 


5,725.007 


2.473,834.837 


0.0846 


11.049.996 


55,455.486 


16,177,243 


73.632.729 


0.15 




1984 


957.340,305 


.030000 


0.07114 


0.00794 


38,384.080 


113,017,708 


35,499,474 


3,685,506 


3,431,175,242 


0.0610 


13,412.006 


90.954,960 


22,062,749 


113.017.709 


0.12 


OH 


1881 


333,931 ,792 


.076891 


0.08886 


0.00835 


3,030.043 


3,930.043 





3.930.043 


333,931,792 


0.0887 


2.788.330 





3,930.043 


3.930.043 


0.71 




1982 


334.115,763 


.056446 


0.07600 


0.00821 


5.605.130 


8.825.182 


3,390,870 


2.504.269 


666,047,555 


0.0824 


3.811.436 


3,390.670 


6,434.312 


9.825.182 


0.40 




1983 


359,726,189 


.040000 


0.05600 


0.00535 


5,767,904 


15.613.176 


5,051,320 


736.674 


1,027,773,744 


0.0732 


4.358.262 


8.442.190 


7,170,966 


15.613.176 


0.28




1984 


401.626,269 


.030000 


0.06365 


0.00475 


13,601.819 


29,214,995 


12,844,575 


657.245 


1,429,600,013 


0.0706 


4.756.803 


21,386.764 


7,828,231 


29.214.995 


16 


OK 


1981 


56,315,715 


.040000 


0.06687 


0.01023 


1,506,628 


1,508,628 


527,270 


861.357 


58,315,715 


0.0658 


596.570 


527.270 


881,357 


1.508,628 


0.40 




1982


44,318.866 


.040000 


0.03813 


0.00664 


-62,676 


1.506.628 


-195.777 


112.801 


102,634,581 


0.0539 


665.203 


331.493 


1,094.258 


1,425,751 


0.47 




1983 


46,160,858 


.040000 


0.04051 


0.00850 


23.547 


1.532.174 


-156.179 


179,725 


148.804,439 


0.0497 


774,458 


175,314 


1,273,984 


1,449,298 


0.53 




1984 


49,398.453 


.030000 


0.03021 


0.00497 


10.374 


1.542.548 


-52,108 


62,482 


186,202.882 


0.0449 


812.441 


123,206 


1,336,466 


1,459,672 


56 


OR


1981


61.574.104 


.098113 


0.06772


0.01097 


-1.871,422 








-1,871.422 


61.574.104 


0.0677


675,468 





-1.871.422 


■1.871.422 


36 




1982 


52.861.730 


.068056 


0.07068 


0.01088 


86,408 


86,408 





86,409 


114,455,834 


0.0691


887.293 





-1,785,013 


■1,785,013 


50 




1983 


52,844.417 


.040000 


0.05983 


0.00889 


1.047,905 


1.134,314 





1.047,905 


167,300,251 


0.0662 


1.029,773 





737,108 


-737,108 


>1 00 



Table 3-9. Application of Rule D to states (continued)

Column headings, left to right:

STATE
Year
Annual Statistics: Fed Contrib, Target, R-hat
Rule A Disallowance*: Annual, Cumulated (if positive)
RULE D, Annual Values: Cash, Book
RULE D: Fed Contrib, R-hat, sigma(D), Cash, Book, Total, cv




1984


57.654.583 


.030000 


0.04617 


0.00724 


832.275 


2,066,588 





832,275 


224,854.834 


0.0610 


1,111.157 





195.166 


195,166 


>1.00 


PA 


1981


421,504.157 


.122180 


0.09046 


0.00562 


-13.374,327 








-13,374.327 


421,504,157 


0.0005 


2.368.853 





-13.374,327 


-13.374,327 


0.16 




1982


420,207,729 


.081085 


0.08637 


0.00764 


1.786,388 


1.786.388 





1,786.388 


841,711,888 


0.0879 


4,091.865 





11.577,939 


-11.577,939 


0.35 




1983


416.840.265 


.040000 


0.0«0*3 


0.01006 


21.218,488 


23.015,877 


10.738 


21,208,740 1 


,258,352,151 


0.0888 


5,854,596 


10.739 


9,630,810 


9,641,550 


0.61 




1984 


405,621.888 


.030000 


0.0MM2 


0.00667 


24,588,788 


47.604.876 


23.018.090 


1,568,708 1 


,663,874,048 


0.0883 


8,808,827 


23,029.630 


11,200.520 


34,230,349 


0.20 


RI


1981


43.270.544 


.087730 


0.06261 


0.06261 


-1,524.378 








-1,524,378 


43,270.544 


0.0625 


2,704,842 





-1.524,378 


-1,524,378 


>1.00 




1982


40.285.720 


.068870 




0.01030 


•484.637 








-484,837 


83,556.264 


0.0588 


2,737,036 





■2,009,015 


-2.009,015 


>1.00 




1983


38.028.872 


.040000 


0.06197 


0.01130 


857.464 


857.464 





857,464 


122,585,136 


0.0605 


2,772,902 





-1,151,551 


-1,151,551 


>1.00 




1984


41.310.420 


.030000 


0.03700 


0.0067* 


280.216 


1.146.070 





288,219 


169,001.558 


0.0548 


2.786,583 





-862,336 


-862,336 


>1.00 


SC 


1981 


58.158.502 


.060619 


0.07640 


0.006*9 


1,004,170 


1.004.170 


456,352 


547,818 


56.158,502 


0.0784 


333,020 


456,352 


547.818 


1,004,170 


0.33 




1982


53.583.594 


.050259 


0.0aB82 


0.00790 


2,071,505 


3.075,786 


1,768,287 


303,308 


108.742.086 


0.0835 


517.402 


2,224,640 


851,126 


3,075.766 


17 




1983


53.678,391 


.040000 


0.07085 


0.00001 


1,655,017 


4,731,682 


1,400.409 


255.508 


163,418,487 


0.0784 


672,726 


3,625,048 


1.106.634 


4,731.682 


0.14 




1984


54,875.425 


.030000 


0.07784 


0.00699 


2.626,240 


7,356,822 


2.525,171 


100,070 


218,283,812 


0.0780 


733,558 


8.150,219 


1,206.704 


7,356,922 


0.10 


SD


1981


11.888.284 


.045230 


0.04631 


0.01448 


12,816 


12,816 





12,816 


11,866,284 


0.0463 


171,824 





12,816 


12,816 


>1.00 




1982


11.988.541 


.042815 


0.03705 


0.00764 


•69,430 


12,816 





-83,438 


23,265,825 


0.0418 


102,636 





-50,623 


-50,623 


>1.00 




1983


11.882.882 


.040000 


0.02112 


0.00406 


-226.238 


12.816 





-228,236 


35.248,887 


0.0348 


201,667 





-276.859 


-276,859 


0.73 




1984


11. 748.883 


.030000 


0.02«M 


0.00876 


•10.602 


12.816 





-10,682 


48,888,580 


0.0333 


226,522 





-267,552 


-287,552 


0.79 


TN


1981


58.079.820 


.059800 


0.08B50 


0.00680 


1.764.674 


1.764,674 


1,093.806 


660.888 


58,078,820 


0.0885 


401,743 


1,083,806 


660,868 


1,754,674 


0.23 


1982


61.010.170 


.049800 


0.04ai2 


0.00612 


-30,708 


1,764,874 


•215.881 


176.073 


110.080,088 


0.0708 


508,779 


877,845 


836,941 


1,714,888 


0.30 




1983


SS.e38,492 


.040000 


0.04466 


0.00442 


264.828 


2.008,502 


162.184 


02.645 


166,729,568 


0.0620 


565,098 


1,040,128 


929,586 


1,969,715 


0.29 




1984


88.341.339 


.030000 


0.04261 


0.00609 


747.362 


2.766.855 


829.482 


117,870 


224,070,921 


0.0570 


836,751 


1,668,611 


1,047,458 


2,717,087 


0.23 


TX 


1981


•7.876.396 


.069120 


0.07606 


0.00776 


1,306,852 


1.305,052 


273,713 


1,122,230 


87,575,396 


0.0751 


682,212 


273,713 


1,122,239 


1,395,952 


0.49 




1982 


75.666.483 


.049660 


0.08964 


0.00626 


2.576.272 


3.071.223 


2,178,439 


900,833 


183,140.879 


0.0780 


824,664 


2,450,152 


1,521.072 


3,971,223 


0.23 




1983


•4.66a,86S 


.040000 


0.06927 


0.00623 


2.770.606 


0.741.018 


2,199,075 


671.620 


257,800,768 


0.0754 


1,272,153 


4,648,227 


2,092.692 


6,741,918 


0.19 




1984


102.44e.485 


.030000 


0.06««e 


0.00744 


2.761.713 


0.403.631 


2,404,649 


946,869 


360.247.253 


0.0702 


1.483,012 


7,054,076 


2,439,555 


9,493,631 


0.18 


UT 


1981


34.319.580 


.040000 


0.04673 


0.01181 


200.610 


288,610 





200,610 


34,319.560 


0.0487 


405,314 





299,610 


209,610 


>1.00 




1982


32.764.359 


.040000 


0.04ft61 


0.00876 


324,506 


824,206 





924.506 


87,079,939 


0.0483 


497,164 





624,206 


824,206 


0.80 




1983


37.207.756 


.040000 


0.06661 


0.01261 


614.300 


1,238.506 


113,881 


600.310 


104,281,804 


0.0510 


883,602 


113,981 


1,124.525 


1,238,508 


OSS 




1984 


36.843.708 


.030000 


0.06769 


0.00626 


009.125 


2.291.830 


866,388 


128,727 


140,225,402 


0.0534 


760,639 


880.379 


1.251.252 


2,231,630 


0.34 


VA 


1981 


••.066.525 


.091476 


0.03666 


0.00427 


-5,506,724 








-5,508,724 


88,068,525 


0.0350 


423,023 





-5.508,724 


-5.506,724 


0.08 




1982


•3.924.089 


.066737 


0.04065 


0.00477 


-2.365,668 








•2,365,668 


182,892,814 


0.0382 


618,172 





-7,872,390 


-7.872,390 


0.08 




1983


95.81 •.862 


.040000 


0.03757 


0.00545 


-232.358 








-232,356 


288,612,486 


0.0380 


808,996 





-8.104,746 


-8,104,746 


0.10 




1984


•3.253.012 


.030000 


0.03455 


0.00450 


424.301 


424.301 





424,301 


381,885,478 


0.0371 


913,484 





-7,680,445 


-7,680,445 


0.12 


VT 


1981


28.751.544 


.043153


0.05157 


0.01338 


225,168 


225.188 





225.168 


28,751.544 


0.0518 


358.203 





225,168 


225,168 


>1.00 




1982


25.837.630 


.041577 


0.04520 


0.00788 


83.810 


318.777 





83.610 


52.589.174 


0.0484 


413,435 





318,777 


318.777 


>1 00 




1983 


25,020,867 


.040000 


0.07881 


0.01853 


871,060 


1.288,837 


267,866 


703.083 


77.610,041 


0.0582 


621,198 


267,966 


1,021,871 


1.288.837 


0.48 




1984 


27,6S»,3e8 


.030000 


0.05834 


0.01127 


783,866 


2.073,704 


662,424 


121,442 


105,268.407 


0.0583 


685,023 


930.301 


1,143,313 


2,073.704 


0.34 



Table 3-9. Application of Rule D to states (continued) 



Annual Statistics



STATE Year Fed Contrib



Target R-hat



WA



WI



1981
1982
1983
1984

1981
1982
1983
1984



WV 1981
1982
1983
1984

WY 1981
1982



1983
1984



118.607.888 
119.737.415 
130.763.014 
14 7.030,823 



.058243 
.049122 
.040000 
.030000 



0.09333
0.06438 
0.04775 
0.04113 



0.01236 
0.00685 
0.00588 
0.00526 



221.181.560 .067083 

235,838,352 .063541 

275.861. 1S1 .040000 

296,2a7.06S .030000 

41,068.616 .066803 

38.295,428 .064401 

38.464.472 .040000 

52.853.558 .030000 



0.08236 0.00714 

0.06470 0.00648 

0.09076 0.00662 

0.06602 0.00712 

0.07361 0.01314 

0.06245 0.00781 

0.02960 0.00614 

0.04806 0.00639 



4,235,162 .040000 0.13747 0.01261 

4.317,706 .040000 0.04771 0.01253 

5.590,719 .040000 0.07666 0.01964 

6.069,734 .030000 0.0S95* 0.01435 



Rule A Disallowance*: Annual, Cumulated (if positive)

RULE D, Annual Values: Cash, Book

RULE D: Fed Contrib, R-hat, sigma(D), Cash, Book, Total, cv


4,161,585 


4.161,595 


1,750,036 


2,411.559 


118.607,888 


0.0933


1,465,993 


1,750.036 


2.411,559 


4,161.595 


35 


1,626,953 


5.988,548 


1,475.174 


351,780 


238.345.303 


0.0788 


1,679.841 


3,225,209 


2,763.339 


5.988.548 


28 


1,013,568 


7.002,117 


736,884 


276.685 


389.128,317 


0.0678


1,848,039 


3.962.093 


3,040,024 


7,002,117 


0.26 


1.636.454 


8.838,571 


1.380,985 


255.469 


516.159.240 


0.0602


2,003,339 


5,343.078 


3,295.493 


8,638,571 


0.23


-1.040,217 








-1.040,217 


221.181,560 


0.0824 


1.579,236 





-1.040.217 


■1.040,217 


>1 00 


284,563 


294.563 





294.563 


457,020,812 


0.0733 


2,197,613 





-745,854 


745,654 


>1.00 


2,866.114 


3.260.677 





2.866.114 


732,682.063 


0.0648 


2,856.515 





2,220.460 


2.220.460 


>1.00 


10,671.940 


13,932,218 


7,050,667 


3.620,673 1 


.028.848,148 


0.0652 


3.550,963 


7.050,667 


5.841,334 


12,892,001 


0.28 


-623,955 








-623,855 


41,066.616 


0.0736 


539.642 





-623.955 


•623,955 


0.86


681,184 


691,194 





681.184 


78.364,045 


0.0779 


618.847 





67.239 


67.239 


>1 00 


-382,338 


691,194 





-382.338 


117.628,517 


0.0622 


662.381 





-325.099 


325,099 


>1 00 


857.400 


1.646.595 





957.400 


170.782.076 


0.0578 


742.365 





632.301 


632,301 


>1.00 


412.803 


412,803 


324.951 


87,852 


4,235.182 


0.1375 


53,406 


324,951 


87.852 


412,803 


0.13 


33.280 


446,093 


-3,911 


37.201 


8.552.688 


0.0922 


76,020 


321,040 


125.053 


446.093 


0.17 


206.166 


652,278 


139.250 


66.936 


14.143.603 


0.0661 


116,711 


460.289 


191.989 


652,278 


0.18 


155,324 


607.603 


107,753 


47.571 


20.213.337 


0.0770 


145,629 


568.042 


239.560 


807,603 


0.18 



Total

1981     72,162,950     72,162,950      28,900,797    -19,685,383      9,215,414
1982     84,867,566    167,130,516      82,560,681      6,305,223     88,865,804
1983    181,686,739    348,827,248     188,240,317     75,204,990    263,445,307
1984    230,088,280    978,826,546     368,130,578    123,432,787    491,563,365

*Computed by simple application of Rule A. For states AZ and TX, these differ from the disallowances actually assessed (see
Table 3-4), and for other states differ slightly from those shown in Table 3-4 because of variations in treatment of rounding errors.






APPENDIX A 

DESCRIPTION OF THE THREE TEST POPULATIONS
AND THE SAMPLING PROCEDURE USED IN SIMULATIONS



The test populations consist of the cases included in the Federal 
subsamples for the year ending September 30, 1982, for three groups of states. The 
states used were: 

Population A: Illinois, New Jersey, Ohio, Pennsylvania 

Population B: Maryland, Michigan, South Carolina, Texas 

Population C: Arkansas, Colorado, Hawaii, Nebraska, Oregon, 

West Virginia 

For each test population, the states chosen provide a sample of approximately 
1500 cases that could be used as a test population from which samples could be 
drawn, with replacement, to study some of the characteristics of various sampling 
and estimation procedures for AFDC. 

The following tables give some of the characteristics of each of the
three test populations. Tables A-1 through A-3 provide summary measures.
Tables A-1A through A-3C list the individual cases, by type.

From each population, simple random samples simulating state QC
samples of various specified sizes were drawn in the following way. For each test
population, the cases for which payment errors (ineligible, overpayment, or
underpayment) were found by the state QC or by the Federal review were termed
"error cases." Let P denote the proportion of error cases in the population, and let n
denote the specified size of the state sample.

The number of error cases to be included in the state sample was
determined by a random draw from the binomial distribution whose parameters are
P and n. That number of error cases was then drawn as a simple random sample,
with replacement, from the set of error cases in the test population.

For the balance of the state sample, no error cases were involved. 
Consequently, the balance of the sample was drawn as a simple random sample of 
payments from the normal distribution whose mean and variance are those of the 
payments for the set of non-error cases of the population. 
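
The state-sample procedure described above can be sketched in Python. This is only an illustrative reading of the text, not the program used for the report: the function name `draw_state_sample`, its arguments, and the use of NumPy are assumptions, as is the choice of the population form of the variance for the non-error payments.

```python
import numpy as np

def draw_state_sample(error_cases, nonerror_payments, n, rng):
    """Simulate one state QC sample of size n from a test population.

    error_cases       -- array of the population's error cases (one row per case)
    nonerror_payments -- 1-D array of payments for the non-error cases
    n                 -- specified size of the state sample
    rng               -- a numpy.random.Generator
    """
    error_cases = np.asarray(error_cases)
    nonerror_payments = np.asarray(nonerror_payments, dtype=float)

    # P: proportion of error cases in the test population.
    population_size = len(error_cases) + len(nonerror_payments)
    P = len(error_cases) / population_size

    # Number of error cases in the state sample: one binomial(n, P) draw.
    n_errors = rng.binomial(n, P)

    # Error cases: simple random sample WITH replacement.
    idx = rng.integers(0, len(error_cases), size=n_errors)
    sampled_errors = error_cases[idx]

    # Balance of the sample: payments drawn from a normal distribution with
    # the mean and variance of the non-error payments (ddof=0 is an assumption).
    mu = nonerror_payments.mean()
    sd = nonerror_payments.std()
    sampled_payments = rng.normal(mu, sd, size=n - n_errors)

    return sampled_errors, sampled_payments
```

Note that a normal draw can in principle produce a negative payment; the text does not say whether such draws were truncated, so the sketch leaves them untouched.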

A Federal subsample of size n' was drawn from each state sample. Let pg
denote the proportion of error cases in the state sample that was selected. The
number of error cases to be included in the Federal subsample was determined by a
random draw from the binomial distribution whose parameters are pg and n'. That
number of error cases in the state sample was then selected for the Federal
subsample as a simple random sample, without replacement.

Subsamples of the non-error cases in the state sample did not have to 
be drawn, since estimates of the average overpayment per case, or of its variance, do 
not depend on the payment values of the non-error cases in the Federal subsample. 
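
A minimal sketch of the Federal subsampling step follows, again in Python with NumPy. The function name and arguments are hypothetical. The text does not say what was done when the binomial draw exceeds the number of error cases available in the state sample, so the cap applied below is an assumption.

```python
import numpy as np

def draw_federal_subsample(sampled_errors, n_state, n_fed, rng):
    """Select the error cases of a Federal subsample from a state sample.

    sampled_errors -- error cases that fell in the state sample
    n_state        -- size n of the state sample
    n_fed          -- size n' of the Federal subsample
    rng            -- a numpy.random.Generator
    """
    sampled_errors = np.asarray(sampled_errors)

    # Proportion of error cases in the state sample (written pg in the text).
    p_state = len(sampled_errors) / n_state

    # Number of error cases in the Federal subsample: binomial(n', p_state).
    n_errors = rng.binomial(n_fed, p_state)

    # Assumption: cap the draw at the number of error cases available,
    # since selection is without replacement.
    n_errors = min(n_errors, len(sampled_errors))
    if n_errors == 0:
        return sampled_errors[:0]

    # Simple random sample WITHOUT replacement.
    idx = rng.choice(len(sampled_errors), size=n_errors, replace=False)
    return sampled_errors[idx]
```

Only the error cases are returned because, as stated above, the estimates do not depend on the payment values of the non-error cases in the Federal subsample.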

Except as otherwise specified, the statistics given in this report are based
on repeated simple random samples from the test populations. Listings of the
various results for each repetition of the sampling are available. Other sampling
and estimation procedures can be applied if desired.
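
The idea of basing statistics on repeated samples can be illustrated with a short Python sketch. The population below is synthetic and is not taken from the report; the function name, sample size, and number of repetitions are likewise illustrative assumptions, and the estimator shown (the sample mean overpayment) stands in for whichever estimator is being studied.

```python
import numpy as np

def repeat_sampling(population, n, reps, rng):
    """Draw `reps` simple random samples of size n (with replacement)
    and return the estimated average overpayment from each repetition."""
    population = np.asarray(population, dtype=float)
    estimates = np.empty(reps)
    for r in range(reps):
        sample = rng.choice(population, size=n, replace=True)
        estimates[r] = sample.mean()
    return estimates

rng = np.random.default_rng(12345)

# Synthetic population (NOT from the report): 900 cases with no
# overpayment and 100 cases overpaid by 150 each.
population = np.concatenate([np.zeros(900), np.full(100, 150.0)])

estimates = repeat_sampling(population, n=200, reps=1000, rng=rng)

# The spread of the estimates across repetitions approximates the
# sampling error of the estimator for this sample size.
print("mean of estimates:", estimates.mean())
print("standard deviation of estimates:", estimates.std(ddof=1))
```

Keeping the estimate from every repetition, rather than only a summary, mirrors the statement that listings for each repetition are available.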






Table A-1. Statistics for Population A

Type of case                                                       Number   Percent

Total cases                                                         1,478    100.00

Cases in which both the Federal and state findings were that
there was no payment error                                          1,266     85.66

Cases in which payment errors were found either by the state
QC or the Federal review                                              212     14.34

Cases which the state found ineligible. Table A-1A lists these
cases, showing the monthly payment and the Federal finding for
each case. In this table, underpayments are shown as zero (as
they are treated in the analyses).                                     62      4.19

Cases in which the state found no error or only underpayment
error, and for which the Federal review found an overpayment.
Table A-2A lists these cases, showing the monthly payment and
the Federal finding.                                                   49      3.32

Other cases in which the state found an overpayment error.
Table A-3A lists these cases, showing the monthly payment, the
state finding, and the Federal finding.                               101      6.83

                                                    State       Federal
Statistic                                          finding      finding

Average monthly payment                             296.22        --
Variance of monthly payment                      64,892.93        --
Standard deviation of monthly payment               254.74        --
Coefficient of variation of payments                  0.86        --
Average monthly overpayment                          17.19        21.62
Variance of overpayments                          3,762.48     4,970.75
Standard deviation of overpayments                   61.34        70.50
Coefficient of variation of overpayments              3.57         3.26
Skewness (μ₃/σ³)                                       n/a         3.80
Kurtosis (μ₄/σ⁴)                                       n/a        17.70
Percent of cases with overpayments                   11.03        12.65

Correlation of state and Federal findings of overpayment errors      .828

Regression coefficient for the regression of the Federal findings
of overpayment on the state finding                                  .952

Overpayment error rate                                              .0730
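
The summary statistics in Table A-1 are linked by simple identities, which the following Python sketch checks. It assumes the average monthly payment is 296.22 (the value implied by the printed standard deviation and coefficient of variation) and that the overpayment error rate is the average Federal overpayment divided by the average payment; the second relation is not stated in the text but is consistent with the printed figures.

```python
import math

# Values from Table A-1 (Population A). The average monthly payment of
# 296.22 is an assumption inferred from the other printed figures.
mean_payment = 296.22
var_payment = 64892.93
mean_overpay_fed = 21.62
var_overpay_fed = 4970.75

# Standard deviation is the square root of the variance.
sd_payment = math.sqrt(var_payment)
sd_overpay_fed = math.sqrt(var_overpay_fed)

# Coefficient of variation is the standard deviation divided by the mean.
cv_payment = sd_payment / mean_payment
cv_overpay_fed = sd_overpay_fed / mean_overpay_fed

# Assumed definition: error rate = average overpayment / average payment.
error_rate = mean_overpay_fed / mean_payment

print(f"SD of payments:           {sd_payment:.2f}")
print(f"CV of payments:           {cv_payment:.2f}")
print(f"SD of overpayments (Fed): {sd_overpay_fed:.2f}")
print(f"CV of overpayments (Fed): {cv_overpay_fed:.2f}")
print(f"Overpayment error rate:   {error_rate:.4f}")
```

The same checks can be applied to Tables A-2 and A-3 by substituting their values.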






Table A-2. Statistics for Population B

Type of case                                                       Number   Percent

Total cases                                                         1,480    100.00

Cases in which both the Federal and state findings were that
there was no payment error                                          1,260     85.14

Cases in which payment errors were found either by the state
QC or the Federal review                                              220     14.86

Cases which the state found ineligible. Table A-1B lists these
cases, showing the monthly payment and the Federal finding for
each case. In this table, underpayments are shown as zero (as
they are treated in the analyses).                                     76      5.14

Cases in which the state found no error or only underpayment
error, and for which the Federal review found an overpayment.
Table A-2B lists these cases, showing the monthly payment and
the Federal finding.                                                   43      2.91

Other cases in which the state found an overpayment error.
Table A-3B lists these cases, showing the monthly payment, the
state finding, and the Federal finding.                               101      6.82

                                                    State       Federal
Statistic                                          finding      finding

Average monthly payment                             210.06        --
Variance of monthly payment                      14,633.67        --
Standard deviation of monthly payment               120.97        --
Coefficient of variation of payments                  0.58        --
Average monthly overpayment                          15.04        16.69
Variance of overpayments                          3,175.10     3,487.75
Standard deviation of overpayments                   56.35        59.06
Coefficient of variation of overpayments              3.75         3.54
Skewness (μ₃/σ³)                                       n/a         4.90
Kurtosis (μ₄/σ⁴)                                       n/a        32.10
Percent of cases with overpayments                   11.96        13.11

Correlation of state and Federal findings of overpayment errors      .940

Regression coefficient for the regression of the Federal findings
of overpayment on the state finding                                  .985

Overpayment error rate                                              .0795






Table A-3. Statistics for Population C

Type of case                                                       Number   Percent

Total cases                                                         1,525    100.00

Cases in which both the Federal and state findings were that
there was no payment error                                          1,317     86.36

Cases in which payment errors were found either by the state
QC or the Federal review                                              208     13.64

Cases which the state found ineligible. Table A-1C lists these
cases, showing the monthly payment and the Federal finding for
each case. In this table, underpayments are shown as zero (as
they are treated in the analyses).                                     68      4.46

Cases in which the state found no error or only underpayment
error, and for which the Federal review found an overpayment.
Table A-2C lists these cases, showing the monthly payment and
the Federal finding.                                                   54      3.54

Other cases in which the state found an overpayment error.
Table A-3C lists these cases, showing the monthly payment, the
state finding, and the Federal finding.                                86      5.64

                                                    State       Federal
Statistic                                          finding      finding

Average monthly payment                             254.66        --
Variance of monthly payment                      37,495.08        --
Standard deviation of monthly payment               193.64        --
Coefficient of variation of payments                  0.76        --
Average monthly overpayment                          13.66        16.87
Variance of overpayments                          3,312.03     4,365.03
Standard deviation of overpayments                   57.55        66.07
Coefficient of variation of overpayments              4.21         3.92
Skewness (μ₃/σ³)                                       n/a         4.50
Kurtosis (μ₄/σ⁴)                                       n/a        24.70
Percent of cases with overpayments                   10.10        11.21

Correlation of state and Federal findings of overpayment errors      .809

Regression coefficient for the regression of the Federal findings
of overpayment on the state finding                                  .928

Overpayment error rate                                              .0662






Table A-1A. Cases in Population A that state found ineligible, with Federal finding



Amount 


overpaid 


Amount 


overpaid 


State 


Federal 


State 


Federal 


129 


129 


270 


270 


250 


250 


318 


318 


153 


153 


302 


302 


- 1 


368 


302 


302 


jor 


368 


250 


250 


250 


250 


125 


.5 


250 


250 


434 


434 


302 


302 


319 





348 


348 


273 


273 


273 


273 


^3 


273 


360 


360 


273 


273 


137 


137 


360 


360 


273 


273 


?m 


263 


360 


360 


216 


216 


360 


360 


216 


216 


360 


360 


216 


216 


273 


273 


216 





360 


360 


216 


216 


350 


350 


216 


216 


273 


273 


111 


111 


216 


216 


2^ 


263 


216 


216 


131 


131 


216 


216 


395 


395 


216 


216 


321 


321 


111 


111 


273 


273 


216 


216 


321 


321 


263 


263 


172 


172 


216 


216 


265 


265 


262 


262 


387 


387 


318 


318 


172 


172 


381 


381 


360 


360 



Total cases 62 

Cases with Federal zero 2 






Table A-1B. Cases in Population B that state found ineligible, with Federal finding



Amount 


overpaid 


Amount 


overpaid 


State 


Federal 


State 


Federal 


118 


118 


240 


240 


55 


55 


606 


606 


118 


118 


259 


259 


141 


141 


225 


225


112 


112 


84 





12 


12 


409 


409 


141 


141 


395 


395 


164 


164 


273 


273 


23 


23 


434 


434 


141 


141 


413 


413 


85 


85 


206 


206 


153 


153 


491 


491 


141 


141 


327 


327 


118 


118 


102 


102 


164 


164 


133 


133 


102 


102 


172 


172 


102 


102 


163 


163 


102 


102 


97 


97 


102 


102 


204 


204 


48 


48 


141 


141 


133 


133 


118 


118 


163 


163 


118 


118 


102 


102 


14 


14 


163 


163 


85 


85 


133 


133 


23 


23 


72 


72 


118 


118 


102 


102 


23 


23 


211 


211 


85 


85 


211 


211 


230 


230 


r/u 


270 


295 


295 


247 


247 


67 


67 


326 


326 


355 


355 


326 


326 


2J0 


I/O 


134 


134 


211 


211 


211 


211 


211 


211 


211 


211 


247 


247 


211 


211 


326 


326 


295 


295 


326 


326 



Number of cases 76 

Cases with Federal zero 1 






Table A-1C. Cases in Population C that state found ineligible, with Federal finding



Amount 


overpaid 


Amount 


overpaid 


State 


Federal 


State 


Federal 


140 


140 


98 


98 


122 


122 


116 


116 


122 


122 


186 


186 


89 


89 


140 


140 



140 


140 


122 


122 


122 


122 


89 


89 


247 


247 


247 


247


283 


283


247 


247 


63 





50 


50 


168 


168 


185 


185


523 


523 


175 


175 


375 


375 


468 


468 


72 


72 


155 


155 


86 


86 


286 


286 


547 


547 


480 


480 


286 





286 


286 


134 


134 


164 


164 


54 


54 


86 


86 


164 


164 


164 


164 


164 


164 


164 


164 


164 


164 


164 


164 


Total cases 


68 


Cases with Federal zero 


2 



98 


98 


116 


116 


186 


186 


140 


140 


59 


59 


86 


86 


415 


415 


247 


247 


222 


222 


224 


224 


390 


390 


365 


365 


420 


420 


420 


420 


45 


45 


560 


560 


240 


240 


560 


560 


231 


231 


409 


409 


58 


58 


206 


206 


206 


206 


206 


206 


206 


206 


249 


249 


164 


164 


122 


122 


179 


179 


10 


10 


142 


142 


122 


122 


100 


100 


140 


140 






Table A-2A. Cases in Population A for which the state found no error or only underpayment





Federal 


Payment 


Federal 


Payment 


Ineligible 


Overpayment 


Ineligible 


Overpayment 


302 


302 




221 







240 




12 


236 







236 


236 




250 


250 




360 




87 


302 


302 




195 




68 


357 







360 




132 


236 







414 


414 




334 




165 


234 







477 


477 




174 







413 


413 




324 


324 




324 




100 


216 




105 


263 




245 


263 


763 




216 


216 




90 







131 







327 




189 


263 




47 


216 




101 


327 




64 


216 


216 




327 


327 




224 







763 


?A3 




216 







48 







175 


175 




438 




57 


113 







194 







381 




63 


404 




153 


381 


381 




337 




211 


438 




57 


214 




140 


265 







223 







321 














Total cases 49 

Federal finding: 

No overpayment cases 16 

Ineligible cases 15 

Other overpayment cases 18 






Table A-2B. Cases in Population B for which the state found no error or only underpayment 





Federal 


Payment 


Federal 


Payment 


Ineligible 


Overpayment 


Ineligible 


Overpayment 


118 




11 


314 







118 




50 


395 




35 


141 




23 


450 







23 







249 







107 







318 







133 


133 




306 







133 


133 




773 







102 


102 




182 







72 


72 




314 







44 







383 







193 




31 


204 




32 


113 







236 




44 


94 







133 


133 




326 




284 


106 







270 


270 




118 




10 


47? 


47? 




118 


118 




??S 







118 







502 







118


118 




29 







131 


131 




205 







326 




28 


305 







270 


270 




386 




56 









Total cases 43 

Federal finding: 

No overpayment cases 21 

Ineligible cases 11 

Other overpayment cases 11 






Table A-2C. Cases in Population C for which the state found no error or only underpayment 





Federal 


Payment 


Federal 


Payment 


Ineligible


Overpayment 


Ineligible 


Overpayment 


83 




48 


59 







247 







116 


116 




130 







264 







76 







62 







434 







856 


856 




375 


375 




56 







297 







448 


448 




57 







210 


210 




280 




10 


350 


350 




140 







286 


286 




190 




79 


436 




39 


150 







257 




200 


355 


355 




286 




177 


323 







239 




140 


?R6 







177 


117 




150 


150 




134 


134 




286 


286 




176 







253 







164 




17 


204 







136 




82 


361 




278 


176 




30 


786 


286 




134 







339 




199 


122 




33 


547 




67 


100 







69 







51 


51 




98 







20 







65 







100 


100 




161 







173 








Total cases 54 

Federal finding: 

No overpayment cases 26 

Ineligible cases 14 

Other overpayment cases 14 






Table A-3A. Cases in Population A for which the state found eligible but overpayment 





1 
State 


Federal 




State 


Federal 












Payment 


overpayment 


Ineligible 


Overpayment


Payment


overpayment


Ineligible 


Overpayment 


250 


98 




98 


326 


23 




287 


302 


52 




52 


478 


40 




40 


250 


1 ' ,.' 




ro 


381 


"3 




63 


250 


170 




1 


536 


■i 


536 




302 


62 




62 


395 


Jl 




51 


225 


192 




192 


264 


89 




89 


225 


72 




72 


714 


200 




200 


80 


9 




9 


368 


«? 




58 


649 


424 




424 


309 






52 


153 


73 




73 


250 






24 


302 


52 




52 


250 






170 


237 


65 




65 


242 


40 




4lj 


250 


30 




30 


368 


66 




66 


502 


60 




60 


302 


222 




222 


236 


56 




56 


700 


51 




51 


468 


54 




54 


302 


52 




52 


360 


87 




87 


284 


80 




80 


246 


136 







378 


54 




54 


360 


87 







414 


54 




54 


188 


166 




1; ^ 


414 


54 




54 


414 


54 




54 


522 


54 




54 


522 


54 




54 


360 


90 




9", 


273 


136 




136 


311 


65 




15 


273 


136 




136 


414 


54 







273 


136 




136 


246 


136 




136 


273 


136 




136 


180 


41 




41 


360 


87 







263 


47 







360 


91 




91 


216 


99 


216 




414 


141 




141 


127 


63 




63 


414 


141 




U 


263 


37 




37 


263 


47 




47 


206 


131 




131 


262 


64 




64 


200 


64 




64 


164 


14 




14 


216 


105 




105 


263 


51 




51 


263 


47 




47 


475 


148 




148 


327 


64 




64 


1105 


104 




104 


167 


18 




18 


263 


152 




152 


341 


84 




84 


263 


47 




47 


424 


43 




43 


327 


64 




64 


384 


63 




63 


381 


301 




301 


481 


43 




43 


302 


47 




47 


335 


73 




73 


536 


98 




98 


253 


12 







286 


55 




55 


385 


63 




63 


438 


120 




120 


438 


194 




194 


451 


144 




119 


94 


43 




43 


381 


63 




63 


327 


73 




73 


318 


129 




129 


74 


34 




34 


441 


13 




13 


262 


90 




90 


436 


57 




57 


224 


220 




220 


234 


46 







84 


22 




22 


318 


86 




44 











101 cases, of which 7 showed no Federal overpayment 






Table A-3B. Cases in Population B for which the state found eligible but overpayment





State 


Federal 




State 


Federal 












Payment 




Ineligible




Payment 


overpayment 


Ineligible 




139 


63 







318 


11 




11 


121 


43 


121 




568 


76 




76 


118 


47 




47 


354 


106 




106 


164 


14 




37 


106 


87 




87 


110 


12 




12 


327 


68 




68 


183 


53 




62 


568 


76 




76 


102 


28 




28 


418 


76 




76 


184 


21 




21 


406 


59 




59 


163 


129 




129 


506 


18 




56 


193 


127 




127 


253 


9 




9 


270 


177 




112 


421 


13 




11 


68S 


42 




42 


276 


23 







211 


79 




73 


241 


52 


241 




270 


111 




117 


451 


51 




51 


270 


59 




70 


372 


31 




31 


211 


91 




91 


190 


51 




51 


270 


50 




41 


439 


112 




112 


326 


266 




266 


305 


21 







270 


141 




141 


297 


33 




33 


270 


59 




59 


607 


74 




74 


211 


91 




91 


543 


238 




171 


222 


56 




60 


102 


30 




30 


553 


31 




31 


223 


30 




30 


404 


20 




20 


102 


17 




17 


306 


105 




105 


163 


17 




17 


640 


17 




17 


72 


18 




18 


348 


206 




206 


133 


32 




32 


421 


73 




73 


218 


14 




14 


601 


316 




316 


82 


34 




23 


360 


75 




75 


164 


120 




120 


206 


116 




116 


141 


46 




46 


511 


13 




13 


164 


16 




16 


487 


73 







118 


70 




70 


405 


162 




162 


118 


63 




63 


548 


74 




48 


118 


63 




63 


395 


67 




68 


164 


31 




31 


530 


97 




97 


118 


30 




30 


478 


50 




50 


164 


108 




108 


511 


83 




83 


81 


23 




23 


203 


83 




83 


141 


23 




23 


576 


19 




19 


69 


32 




32 


460 


320 




320 


164 


62 




62 


620 


595 




595 


85 


32 




32 


641 


208 




208 


510 


56 




56 


305 


75 




75 


131 


5 




9 


403 


32 




32 


295 


252 




252 


296 


67 




67 


295 


65 




65 


274 


85 




85 


230 


90 




90 


458 


28 




28 


270 


59 




70 


327 


193 




193 


326 


266 




266 


292 


67 




67 











101 cases, of which 4 showed no Federal overpayment 






Table A-3C. Cases in Population C for which the state found eligible but overpayment 





          State        Federal     Federal                 State        Federal     Federal
Payment   overpayment  ineligible  overpayment   Payment   overpayment  ineligible  overpayment


122 


105 




105 


59 


49 




49 


450 


137 




137 


140 


39 







308 


152 




152 


253 


63 




63 


247 


227 




227 


253 


83 




83 


183 


94 




94 


247 


62 




62 


247 


158 




158 


379 


61 




61 


91 


6 




6 


379 


55 







383 


78 




78 


379 


105 




105 


247 


67 




67 


543 


47 




47 


214 


17 




17 


298 


59 




66 


189 


28 




28 


359 


84 




84 


313 


66 




66 


468 


120 




120 


247 


6 




6 


531 


63 




63 


546 


15 




15 


474 


19 




19 


546 


396 




396 


336 


112 




112 


546 


15 




15 


222 


81 




81 


521 


468 







373 


53 




53 


128 


39 




39 


350 


106 




106 


254 


17 




17 


390 


93 




93 


546 


78 




78 


410 


78 




44 


334 


77 




77 


122 


63 




63 


420 


70 




70 


448 


25 




25 


490 


210 




210 


118 


22 







420 


80 







350 


70 


350 




350 


70 




70 


174 


10 







164 


18 







286 


200 




200 


203 


9 







339 


48 




48 


301 


8 




8 


403 


33 




33 


323 


15 




15 


376 


30 




30 


763 


55 




55 


266 


18 




18 


286 


200 




200 


222 


19 




19 


329 


53 




53 


212 


52 




52 


281 


75 




75 


134 


116 




116 


134 


44 




44 


134 


44 




44 


164 


43 




43 


98 


66 




66 


164 


18 




18 


206 


30 




30 


90 


64 




64 


90 


17 


90 




215 


39 




39 


206 


42 




42 


164 


30 




30 


76 


10 




10 


206 


42 




42 


142 


32 




32 


206 


148 




148 


100 


49 




49 


206 


148 




148 


100 


17 




17 


164 


25 







72 


10 




10 



86 cases, of which 9 showed no Federal overpayment 






APPENDIX B 
EVALUATION OF THE REGRESSION AND DIFFERENCE ESTIMATORS 



Classical regression analysis assumes a linear relationship between the
dependent and the independent variables, and that the dependent variable is (at
least approximately) normally distributed for each value of the independent
variable. As noted earlier in this report (Section 2.2), these requirements are
reasonably well satisfied in the application of the regression estimator, because
the "independent" variable is the Federal subsample mean of the error per case as
determined by the state review and the "dependent" variable is the mean error per
case as determined by the Federal re-review for the cases in the same subsample.
Relationships between these means were illustrated in Section 2.2 (Figure 2-1) by
scatter diagrams for 1000 samples drawn from Test Population A for each of four
sample sizes. We include here similar scatter diagrams for the other two test
populations that we have examined (Figures B-1 and B-2).

We emphasize that the linearity is not required for the regression 
estimator to be consistent (i.e., unbiased in large enough samples). However, the 
close approximation to linearity that is illustrated in the figures leads to negligible 
bias even for the smallest sizes of Federal subsamples. A little algebra brings out 
how the bias decreases with sample sizes, and becomes negligible for large enough 
samples. 

The regression estimator of the mean error per case is

$$\bar{x}'' = \bar{x}' + b'(\bar{y} - \bar{y}') .$$

Then, conditional on the state sample $s$, the expected value of $\bar{x}''$ is

$$E(\bar{x}'' \mid s) = \bar{x} + E\{b'(\bar{y} - \bar{y}') \mid s\}$$






Figure B-1. Mean findings of dollar error per case in 1000 independent samples for each of four sample
sizes, Population B

[Scatter diagrams, one panel per sample size: n=2400, n'=360; n=1200, n'=360; n=880, n'=260;
n=350, n'=160. Horizontal axis: state finding.]






Figure B-2. Mean findings of dollar error per case in 1000 independent samples for each of four sample
sizes, Population C

[Scatter diagrams, one panel per sample size: n=2400, n'=360; n=1200, n'=360; n=880, n'=260;
n=350, n'=160. Horizontal axis: state finding.]






and therefore over all possible state samples,

$$E(\bar{x}'') = \bar{X} + E\,E\{b'(\bar{y} - \bar{y}') \mid s\} .$$

Thus, the bias of $\bar{x}''$ as an estimate of $\bar{X}$ is

$$E\,E\{b'(\bar{y} - \bar{y}') \mid s\} .$$

We note that

$$E\{b'(\bar{y} - \bar{y}') \mid s\} = -\mathrm{Cov}(b', \bar{y}' \mid s) = -\rho_{b'\bar{y}' \mid s}\,\sigma_{b' \mid s}\,\sigma_{\bar{y}' \mid s} .$$

Since each of these standard deviations is of order $1/\sqrt{n'}$ and the correlation
coefficient is no greater than 1 in absolute value, the bias is of order no greater than
$1/n'$. Thus, the bias decreases with increases in the size of the Federal subsample
and is negligible for sufficiently large samples.

Also, since the bias of $\bar{x}''$ (and of $\hat{R}$) is of the order $1/n'$, and the standard error
is of the order $1/\sqrt{n'}$, the ratio of the bias to the standard error decreases with
increasing sample size and is negligible for large enough samples.

We have also examined the distribution of the residuals, $d_i = \bar{x}'_i - (a + b\,\bar{y}'_i)$,
for the lines of regression shown in Figure 2-1 in Chapter 2, and in Figures B-1 and
B-2 above. The coefficients a and b of the regression line are computed from the
known population parameters. Summary measures for the distributions of the 1000
residuals are given in Table B-4 for each of the four sample sizes for the three test
populations. The summary measures in the table are defined as follows:

Mean $\bar{d} = \sum d_i / 1000$






Standard deviation $\sigma = [\sum (d_i - \bar{d})^2 / 1000]^{1/2}$

Skewness $= \frac{\sum (d_i - \bar{d})^3}{1000} \Big/ \sigma^3$

Kurtosis $= \frac{\sum (d_i - \bar{d})^4}{1000} \Big/ \sigma^4$

It is seen from the measures of skewness and kurtosis that the
distributions show some moderate departure from normal, but are reasonably close
to the values for a normal distribution of 0 for skewness and 3 for kurtosis.
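The summary measures defined above can be sketched in a few lines of Python. This is only an illustration: the function name `residual_summary` is hypothetical, and the divisor is taken to be the number of residuals supplied (1000 in the simulations described here).

```python
import math


def residual_summary(residuals):
    """Return (mean, standard deviation, skewness, kurtosis) of the residuals.

    The divisor m is the number of residuals, matching the definitions
    in the text where m = 1000.
    """
    m = len(residuals)
    mean = sum(residuals) / m
    centered = [d - mean for d in residuals]
    sigma = math.sqrt(sum(c ** 2 for c in centered) / m)
    # Third and fourth central moments, scaled by powers of sigma.
    skewness = (sum(c ** 3 for c in centered) / m) / sigma ** 3
    kurtosis = (sum(c ** 4 for c in centered) / m) / sigma ** 4
    return mean, sigma, skewness, kurtosis
```

For a normal distribution these measures would be close to 0 for skewness and 3 for kurtosis, which is the benchmark used in Table B-4.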



B.1 Comparison of the Regression and Difference Estimators

We initially had some concern that the approximations that are
involved in the regression estimator and the estimator of its variance might not be
totally satisfactory because of the relatively small sizes of the Federal subsamples.
The so-called difference estimator, on the other hand, provides unbiased estimates
for any sample size, and an unbiased estimate of its variance is available. We have,
therefore, on occasion, considered the use of the difference estimator to replace the
regression estimator. To compare these alternative estimators in the context of the
AFDC quality control program, we have simulated sampling from Population A,
described in Appendix A.

The regression estimator $\hat{R}$ is defined by

$$\hat{R} = \{\bar{x}' + b(\bar{y} - \bar{y}')\}/\bar{t}$$

and the difference estimator $\tilde{R}$ is defined by

$$\tilde{R} = \{\bar{x}' + k(\bar{y} - \bar{y}')\}/\bar{t}$$

where






$\bar{x}' = \sum x_i / n'$ is the average overpayment in the Federal subsample whose size
is n', the average being computed over all cases in the
subsample, regardless of whether there was an overpayment, as
determined by the Federal review;

$\bar{y} = \sum y_i / n$ is the average overpayment in the state QC sample whose size is
n, as determined by the state review;

$\bar{y}' = \sum y_i / n'$ is the average overpayment in the Federal subsample, as
determined by the state QC review;

$b = \sum (x_i - \bar{x}')(y_i - \bar{y}') / \sum (y_i - \bar{y}')^2$
is an estimate of the regression coefficient, as estimated from the
Federal subsample;

k is a constant which, if it were equal to the true value of the
regression coefficient, would minimize the variance of the
difference estimator;

$x_i$, $y_i$ denote respectively the Federal and state determination of the
overpayment for case i;

$\bar{t}$ is the average AFDC payment per case in the state QC sample.
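The two estimators can be illustrated with a short Python sketch. It assumes the Federal and state findings for the Federal subsample are held in two parallel lists and the state findings for the full state sample in a third; the function and argument names are hypothetical and chosen only for this illustration.

```python
def mean(values):
    return sum(values) / len(values)


def regression_coefficient(x_fed, y_fed):
    """Slope b estimated from the Federal subsample (x = Federal, y = state finding)."""
    x_bar, y_bar = mean(x_fed), mean(y_fed)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_fed, y_fed))
    denominator = sum((y - y_bar) ** 2 for y in y_fed)
    return numerator / denominator


def regression_estimate(x_fed, y_fed, y_state, t_bar):
    """Regression estimate of the payment error rate."""
    b = regression_coefficient(x_fed, y_fed)
    return (mean(x_fed) + b * (mean(y_state) - mean(y_fed))) / t_bar


def difference_estimate(x_fed, y_fed, y_state, t_bar, k):
    """Difference estimate of the payment error rate for a fixed constant k."""
    return (mean(x_fed) + k * (mean(y_state) - mean(y_fed))) / t_bar
```

The only difference between the two functions is that the regression estimate computes its slope from the subsample, while the difference estimate takes k as given (1, .9, and .8 in the simulations reported in Tables B-1 and B-2).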

From Population A, 1000 samples were drawn using simple random
sampling (see Appendix A) for various sample sizes to simulate state QC samples,
and from each sample a simple random subsample was drawn to simulate a Federal
subsample. For each sample, the regression estimate and three difference estimates
using three values of the constant k were computed, as well as the appropriate
estimates of their variances. The standard error of the regression estimate $\hat{R}$ is
estimated by

$$s_{\hat{R}} = s_x \{[1 - r^2(1 - n'/n)]/n'\}^{1/2} / \bar{t}$$






and the standard error of the difference estimate $\tilde{R}$ for a given value of k is
estimated by

$$s_{\tilde{R}}(k) = \{(1 - n'/n)(s_x^2 + k^2 s_y^2 - 2krs_x s_y)/n' + s_x^2/n\}^{1/2} / \bar{t},$$

where

$$s_x^2 = \sum (x_i - \bar{x}')^2/(n' - 1)$$

is the unit variance of overpayments as determined by the Federal
review for the cases in the Federal subsample,

$$s_y^2 = \sum (y_i - \bar{y}')^2/(n' - 1)$$

is the unit variance of overpayments as determined by the state QC
review for the cases in the Federal subsample, and r is the correlation
between $x_i$ and $y_i$ in the Federal subsample.
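A minimal Python sketch of the two standard error estimates follows. It assumes r is the sample correlation of the Federal and state findings in the Federal subsample, and that the difference-estimator standard error is divided by the average payment in the same way as the regression one; both are assumptions of this illustration, as are all function names.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    """Unit variance with divisor n' - 1."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def _correlation(x, y):
    mx, my = _mean(x), _mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / math.sqrt(_variance(x) * _variance(y))


def se_regression(x_fed, y_fed, n, t_bar):
    """Estimated standard error of the regression estimate of the error rate."""
    n_sub = len(x_fed)
    r = _correlation(x_fed, y_fed)
    s_x = math.sqrt(_variance(x_fed))
    return s_x * math.sqrt((1 - r ** 2 * (1 - n_sub / n)) / n_sub) / t_bar


def se_difference(x_fed, y_fed, n, t_bar, k):
    """Estimated standard error of the difference estimate for a fixed k."""
    n_sub = len(x_fed)
    r = _correlation(x_fed, y_fed)
    sx2, sy2 = _variance(x_fed), _variance(y_fed)
    within = sx2 + k ** 2 * sy2 - 2 * k * r * math.sqrt(sx2 * sy2)
    return math.sqrt((1 - n_sub / n) * within / n_sub + sx2 / n) / t_bar
```

When the Federal subsample is the whole state sample (n' = n), both expressions reduce to the ordinary standard error of a mean divided by the average payment, which is a convenient sanity check.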

Results of the simulation comparing the estimators are shown in
Tables B-1 and B-2.

The true value of the overpayment error rate in Population A is .0730.
Table B-1 shows that the average value of $\hat{R}$, estimated from the 1000 independent
samples, is very close to the true value for each of the three sample sizes. This
indicates, as discussed more fully below, that the bias, if any, of the regression
estimator is trivial for this population, even for the small sample sizes considered.
The fact that the average values of the difference estimates differ slightly from the
true value is due to sampling variation, for the difference estimator can be shown to
be unbiased.

Table B-2 shows, for each of the four estimators and for each of the 
three sample sizes, the variance (i.e, the square of the standard error) of the 
estimated payment error rate, the average of the estimated variances given by the 
formulas above, and the standard deviation of the estimated variances. We note 
that the variances, estimated by 1000 repetitions of the sampling procedure, differ 
very little among the four estimators, for each of the sample sizes. The average of 






the variance estimates also appears to differ little among the four estimators of the 
payment error rate. The fact that the average of the variance estimates is slightly 
smaller than the estimate of the true variance is attributable to sampling variation, 
since the variance estimator for the difference estimator of the payment error rate 
can be shown to be unbiased. For each size of sample, the four estimates of the 
payment error rate and of its variance were made from the same sample and hence
are expected to be similar. The reasonable interpretation of these results is that the 
bias of the estimator of the variance of the regression estimate is trivial. 

We note also that the standard deviation of the estimated variance 
increases with a decrease in the sample size, approximately as predicted by statistical 
theory. 



B.2 Validity of the Regression Estimator 

Examination of Table B-3 indicates that while the average value of the 
estimated payment error rate is very close to the population value, in 11 of the 
12 independent estimates the average value is somewhat less than the true payment 
error rate for the population. The largest of the individual differences is 2.3 times its 
estimated standard error. These results suggest a small downward bias of the 
regression estimator. However, the indicated biases are all so small that they 
contribute trivially (less than 1 percent) to the mean square error, and are so small 
that they can be neglected. There is no such indication of a bias in the estimates of 
the standard error of the estimated payment error rate. 

We emphasize that the absence of appreciable bias in the regression 
estimator or in the estimator of its variance does not suffice to ensure that the 
estimator of the payment error rate is satisfactory. The variability of the estimated 
variance is quite large, as can be seen from the simulation results presented in 
Table B-3. Hence, much of the variation of the standard error between years for a 
given state, and much of the variation between states in a given year, may be due 
simply to sampling error. 






Various sample sizes have been used in this appendix and elsewhere 
in this report. One set of sample sizes, in particular, 

n = 1200 n' = 180 
n = 500 n' = 80 
n = 300 n' = 50 

was used in initial analyses. The largest of these sample sizes was intended to
approximate the six-month sample sizes in use in the larger states. The smaller
sample sizes were chosen to evaluate results with small Federal sample sizes, even
smaller than those in use. Later, in order to approximate more nearly many of the
annual sample sizes currently in use in AFDC, additional sample sizes were used in
the simulations, as follows:

n = 2400 n' = 360

n = 1200 n' = 360

n = 880 n' = 260

n = 350 n' = 160

These sample sizes were generally used in the more recent analyses. 

Similarly, Population A was the only test population that was defined
initially. Many of the earlier simulations used only that test population. Later, Test
Populations B and C were defined, in order to examine the stability of the
conclusions for various populations. Generally, the conclusions were found to be
very similar for the test populations, and consequently, some of the analyses were
limited to one or two test populations.

However, many of the simulations and analyses were carried through
for all three test populations. For example, Tables C-2A through C-2C in
Appendix C show a number of comparable simulation results for all three test
populations. From those tables, we summarize in Table B-3 the regression estimates
of the overpayment error rate for each of four sample sizes for each of the three test
populations, and their estimated standard errors, so that comparisons can be made
with the true overpayment error rates that are being estimated.






Table B-1. Average values of the estimated payment error rate R and its estimated standard deviation
based on 1000 independent samples from Population A, by estimator and sample size

               n=1200, n'=180         n=500, n'=80           n=300, n'=50
               Average   Standard     Average   Standard     Average   Standard
Estimator      R         deviation    R         deviation    R         deviation

Regression     0.0727    0.0118       0.0727    0.0176       0.07?3    0.0228
Difference
  k=1          0.07?8    0.0117       0.0728    0.0173       0.0725    0.0222
  k=.9         0.0728    0.0118       0.0727    0.0173       0.0726    0.0223
  k=.8         0.0728    0.0120       0.0726    0.0176       0.0727    0.0228





Table B-2. Variance of the estimated payment error rate and the average of estimates, by estimator and
sample size (based on 1000 independent samples from Population A)

                       n=1200, n'=180                      n=500, n'=80                       n=300, n'=50
                         Average    Standard                 Average    Standard                 Average    Standard
                         variance   deviation                variance   deviation                variance   deviation
Estimator     Variance   estimate   of variance   Variance   estimate   of variance   Variance   estimate   of variance

Regression    1.39E-04   1.30E-04   .6300E-04     3.10E-04   2.90E-04   2.06E-04      5.20E-04   4.70E-04   4.26E-04
Difference
  k=1         1.37E-04   1.31E-04   .6400E-04     2.99E-04   2.94E-04   2.08E-04      4.93E-04   4.79E-04   4.33E-04
  k=.9        1.39E-04   1.31E-04   .6300E-04     2.99E-04   2.94E-04   2.07E-04      4.97E-04   4.79E-04   4.30E-04
  k=.8        1.44E-04   1.35E-04   .6300E-04     3.10E-04   3.03E-04   2.07E-04      5.20E-04   4.94E-04   4.30E-04
Average       1.40E-04   1.32E-04   .6300E-04     3.05E-04   2.95E-04   2.07E-04      5.08E-04   4.81E-04   4.30E-04






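Table B-3. Some summary statistics from 1000 simulations for Populations A, B, and C

 Sample size                                              Test population
   n      n'    Statistic                               A         B         C

                $R$                                   .07297    .07945    .06623

 2400    360    $\bar{\hat{R}}$                       .07306    .07893    .06592
                $\hat{\sigma}_{\bar{\hat{R}}}$        .00025    .00023    .00028
                $\hat{\sigma}_{\hat{R}}$              .00792    .00736    .00872
                $\bar{s}_{\hat{R}}$                   .00791    .00713    .00861
                s.e.($\bar{s}_{\hat{R}}$)             .00004    .00004    .00007
                s.e.($s_{\hat{R}}$)                   .00138    .00139    .00227

 1200    360    $\bar{\hat{R}}$                       .07245    .07906    .06601
                $\hat{\sigma}_{\bar{\hat{R}}}$        .00027    .00026    .00030
                $\hat{\sigma}_{\hat{R}}$              .00839    .00807    .00937
                $\bar{s}_{\hat{R}}$                   .00884    .00895    .00966
                s.e.($\bar{s}_{\hat{R}}$)             .00004    .00004    .00007
                s.e.($s_{\hat{R}}$)                   .00126    .00139    .00214

  880    260    $\bar{\hat{R}}$                       .07271    .07882    .06564
                $\hat{\sigma}_{\bar{\hat{R}}}$        .00033    .00031    .00035
                $\hat{\sigma}_{\hat{R}}$              .01036    .00973    .01091
                $\bar{s}_{\hat{R}}$                   .01033    .01040    .01116
                s.e.($\bar{s}_{\hat{R}}$)             .00006    .00006    .00009
                s.e.($s_{\hat{R}}$)                   .00182    .00190    .00289

  350    160    $\bar{\hat{R}}$                       .07290    .07930    .06607
                $\hat{\sigma}_{\bar{\hat{R}}}$        .00048    .00049    .00051
                $\hat{\sigma}_{\hat{R}}$              .01513    .01560    .01624
                $\bar{s}_{\hat{R}}$                   .01451    .01544    .01552
                s.e.($\bar{s}_{\hat{R}}$)             .00009    .00011    .00015
                s.e.($s_{\hat{R}}$)                   .00292    .00363    .00471

Definitions:

$R$                               True payment error rate
$\hat{R}$                         Estimated error rate for a single sample
$\bar{\hat{R}}$                   Mean value of 1000 estimates of R
$\hat{\sigma}_{\bar{\hat{R}}}$    Estimated standard error of $\bar{\hat{R}}$
$\hat{\sigma}_{\hat{R}}$          Estimated standard error of $\hat{R}$
$s_{\hat{R}}$                     Estimated standard error of $\hat{R}$ for a single sample
$\bar{s}_{\hat{R}}$               Mean estimate of the standard error of $\hat{R}$
s.e.($\bar{s}_{\hat{R}}$)         Estimated standard error of $\bar{s}_{\hat{R}}$
s.e.($s_{\hat{R}}$)               Estimated standard error of $s_{\hat{R}}$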






Table B-4. Summary measures for distribution of residuals, for regression of $\bar{x}'$ on $\bar{y}'$

                                        Sample size
                         2400/360    1200/360    880/260    350/160

Population A
  Mean                     0.000       0.000      0.000      0.000
  Standard deviation       2.044       2.043      2.491      3.052
  Skewness                 0.383       0.353      0.485      0.538
  Kurtosis                 3.398       3.084      3.045      3.432

Population B
  Mean                     0.000       0.000      0.000      0.000
  Standard deviation       1.029       1.008      1.173      1.532
  Skewness                 0.776       0.823      0.885      1.122
  Kurtosis                 3.681       3.872      3.900      4.444

Population C
  Mean                     0.000       0.000      0.000      0.000
  Standard deviation       1.988       1.970      2.281      3.061
  Skewness                 0.480       0.572      0.636      0.845
  Kurtosis                 3.090       3.631      3.648      4.092






APPENDIX C 
COMPUTATION OF CONFIDENCE INTERVALS



Confidence intervals for the payment error rate are produced in the
current AFDC quality control program in the following way. An estimate of the
standard error of the estimated payment error rate is computed by the formula
given for $s_{\hat{R}}$ in Section 1.1 of Chapter 1 (Equation (3)) and also in Appendix B. The
lower and upper bounds of the nominal confidence interval at a given confidence
level are defined by $\hat{R} \pm t\,s_{\hat{R}}$, where, for example, t=1.96 for the 95 percent confidence
level and t=1.645 for the 90 percent confidence level. These values of the
coefficient t would be appropriate if $\hat{R}$ were a mean estimated from a simple random
sample from a normal distribution, and $s_{\hat{R}}$ its estimated standard error. This is a
commonly used procedure. Such confidence intervals are referred to as nominal
confidence intervals for the specified level of confidence (say 95 percent) because the
actual probabilities may not conform to the specified level of confidence.

Suppose that the samples were large enough that $\hat{R}$ and $s_{\hat{R}}$ were
approximately normally distributed and also large enough that the coefficient of
variation of $s_{\hat{R}}$ was small (say less than .02). For a nominal confidence level of
95 percent, these conditions are sufficient for the actual probability to be close to
2.5 percent that the lower bound of the interval is greater than the value being
estimated, 2.5 percent that the upper bound is less than the value being estimated,
and 95 percent that the value being estimated is between the bounds. Similar
statements hold for the 90 percent confidence interval. (See the attached Technical
Note for Appendix C.)

For the QC samples in use in AFDC, the distribution of $\hat{R}$ appears to be
reasonably close to normal, although still slightly skewed to the right and somewhat
more skewed for the smaller sample sizes (see Figure 2-2 in Section 2.3 of the
report). The distribution of $s_{\hat{R}}$ is also skewed but still reasonably approaching






normality (see Figure C-1). Moreover, and particularly relevant, the
coefficient of variation of $s_{\hat{R}}$ is quite large, being several times larger than it would
be if the estimate $\hat{R}$ were the sample mean of a normally distributed variable based
on a sample of size n', and $s_{\hat{R}}$ were the associated estimate of its standard error.
Also, $\hat{R}$ and $s_{\hat{R}}$ are positively correlated. The results are not sensitive to that
correlation (which remains constant with increasing sample size), but are highly
sensitive to the coefficient of variation of $s_{\hat{R}}$ (which decreases with increasing
sample size).

Estimated values of the coefficient of variation $V_{s_{\hat{R}}}$ and of the
correlation $\hat{\rho}$ of $\hat{R}$ and $s_{\hat{R}}$ for the regression estimator, for various sample sizes,
drawn from Test Populations A, B, and C, are given in Table C-1.



Table C-1. Correlation of $\hat{R}$ and $s_{\hat{R}}$, coefficients of variation of $s_{\hat{R}}$, and $\hat{\beta}$, estimated from
1000 independent samples of Test Populations A, B, and C, for various sample sizes

(ρ = estimated correlation of $\hat{R}$ and $s_{\hat{R}}$; V = coefficient of variation of $s_{\hat{R}}$; β = $\hat{\beta}$)

   Sample sizes           Population A        Population B        Population C
    n     n'    n'/n      ρ      V     β      ρ      V     β      ρ      V     β

  2400   360    .15      .75    .18    48    .66    .20    59    .68    .27   106
  1200   360    .30      .75    .14    29    .62    .16    38    .66    .22    71
   880   260    .30      .76    .18    35    .61    .18    35    .68    .26    71
   350   160    .46      .79    .20    27    .67    .24    38    .71    .30    59
  1200   180    .15      .77    .25    46    .64    .27    54     NA     NA    NA
   500    80    .16      .76    .37    45    .67    .39    50     NA     NA    NA
   300    50    .17      .78    .48    47    .60    .50    51     NA     NA    NA

NA - not available.



These are estimated from 1000 independent samples for each population and for 
each sample size. As expected, for a given population, and with some sampling 
variability, the correlations are essentially constant over the various sample sizes, 
whereas the coefficients of variation of $s_{\hat{R}}$ decrease approximately as the square root
of the Federal subsample size n' increases.






Note 1: Table C-1 also shows for each illustrative test population and
sample size some values labeled $\hat{\beta}$. These values provide another
indicator of how much larger the variance of the variance estimates is
than would be expected in estimating a mean from a simple random
sample drawn from a normal population. Thus, for a simple random
sample of n' drawn with replacement (from any distribution of a
variable X), the relvariance of the sample estimate of the variance of
the mean is approximately¹

$$V^2_{s^2_{\bar{x}}} = \frac{\beta - 1}{n'} \qquad \text{(a)}$$

where $\sigma^2$ is the variance of the distribution,

$$s^2_{\bar{x}} = \sum (x_i - \bar{x})^2 / (n' - 1)n'$$

is the estimated variance of the sample mean, $\bar{x}$, for a simple random
sample of n' (drawn from any distribution), and

$$\beta = \sum (X_i - \bar{X})^4 / N\sigma^4 .$$

For a normal distribution, $\beta$ has the value 3, but may have considerably
larger (or smaller) values for various non-normal distributions. Also,
in general, the relvariance of $s_{\bar{x}}$ is approximately one-fourth of the
relvariance of $s^2_{\bar{x}}$. If we substitute $\hat{\beta}$ for $\beta$ and $4V^2_{s_{\hat{R}}}$ for $V^2_{s^2_{\bar{x}}}$ in the above
equation we obtain



¹ Hansen, M.H., Hurwitz, W.N., and Madow, W.G. (1953), Sample Survey Methods and Theory, Vol. I,
Chapter 10 (New York: John Wiley & Sons). Theory for samples drawn with replacement provides a
simple approximation for samples drawn without replacement provided the sampling fraction is
small.






$$\hat{\beta} = 4n'V^2_{s_{\hat{R}}} + 1 .$$

We have found it convenient, in Appendix E, to use these values of P 
in obtaining rough approximations to the variance of state estimates of 
2 
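The relation between the tabulated values and the coefficient of variation can be checked numerically. The sketch below assumes the relation is beta = 4 n' V^2 + 1, with V the coefficient of variation of the estimated standard error; the function name `beta_hat` is hypothetical.

```python
def beta_hat(n_sub, cv):
    """Implied kurtosis-type measure from the Federal subsample size n'
    and the coefficient of variation of the estimated standard error."""
    return 4 * n_sub * cv ** 2 + 1


# Inputs taken from the Population A rows of Table C-1.
print(round(beta_hat(360, 0.18)))  # n=2400, n'=360
print(round(beta_hat(180, 0.25)))  # n=1200, n'=180
```

Because the coefficients of variation in Table C-1 are rounded to two decimals, values recomputed this way can differ from the tabulated ones by a unit or so.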

For AFDC-QC, a consequence of the large coefficient of variation of $s_{\hat{R}}$
and of the positive correlation of $\hat{R}$ and $s_{\hat{R}}$ is that the probability of the left tail (i.e.,
the probability that the lower confidence bound is above the value being estimated)
is considerably less than the nominal probability; the probability of the right tail is
considerably greater than the nominal probability. The technical note attached to
this Appendix shows the expected frequency below, above, and covered by 95 percent
and 90 percent nominal confidence intervals for the case in which both $\hat{R}$ and $s_{\hat{R}}$ are
normally distributed and are positively correlated, for various values of the
coefficient of variation of $s_{\hat{R}}$ and of the correlation of the two variables.



Figures C-2A to C-2D are scatter diagrams showing the relationship
between the values of $\hat{R}$ and $s_{\hat{R}}$ for the 1000 samples drawn at each of four sample
sizes for Population A. That the correlation between the variables is positive is
clear. It is also quite clear that the joint distribution is reasonably close to normal.
The ellipses in the diagram are such as to enclose a specified proportion of the
points if the joint distribution were exactly normal. The inner ellipse would
include 50 percent, the next would include 90 percent, the third would include
95 percent, and the outer ellipse would include 99 percent of the points. For the 1000
actual samples, the results were as follows:







                          Sample size
Contour    2400/360    1200/360    880/260    350/160

  .50         491         506         495        508
  .90         901         902         904        898
  .95         957         950         951        950
  .99         990         993         992        983






Thus, the observed frequencies approximate, reasonably closely, the proportions 
that are expected for the bivariate normal distribution. However, the moderate 
skewness of the marginal distribution of R and sr is evident; in each case, there are 
more points in the right hand tail of the marginal distribution than in the left hand 
tail. 
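The count of points inside each ellipse can be reproduced, in outline, with the sketch below. It relies on the fact that, for a bivariate normal distribution, the ellipse enclosing a proportion p is the set of points whose squared Mahalanobis distance from the center is at most -2 ln(1 - p). The sketch centers and scales the ellipses with sample moments, which is an assumption; the report does not say how its ellipses were fitted. The helper `simulate_points` and all names are hypothetical and serve only to produce test data.

```python
import math
import random


def contour_counts(xs, ys, levels=(0.50, 0.90, 0.95, 0.99)):
    """Count the points falling inside each bivariate-normal probability ellipse."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mx) ** 2 for x in xs) / (m - 1)
    syy = sum((y - my) ** 2 for y in ys) / (m - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (m - 1)
    det = sxx * syy - sxy ** 2
    # Squared Mahalanobis distance of each point from the center.
    d2 = [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in zip(xs, ys)
    ]
    return {p: sum(1 for d in d2 if d <= -2 * math.log(1 - p)) for p in levels}


def simulate_points(m, rho, seed):
    """Draw m correlated standard normal pairs (test data only)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(m):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(u)
        ys.append(rho * u + math.sqrt(1 - rho ** 2) * v)
    return xs, ys
```

Applied to 1000 exactly bivariate normal points, the counts should fall near 500, 900, 950, and 990, which is the benchmark against which the observed frequencies are compared.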

Tables C-2A, C-2B, and C-2C, which are based on 1000 independent 
samples drawn for Populations A, B, and C, respectively, show summary statistics of
the current AFDC sample design. They also show some summary measures for 
specific confidence bounds and for the coverage of nominal confidence intervals 
based on the same 1000 samples. 

The panel headed "CONFIDENCE BOUNDS" gives, for example, the
value of $\hat{R}$ such that 2.5 percent of the estimates $\hat{R}$ fall below it. This value was
estimated from the 1000 independent samples drawn from the specified population,
using the state and Federal sample sizes specified in the column headings of the
table. The 5 percent, 95 percent, and 97.5 percent points were similarly estimated
from the same samples.

The next panel, headed "NOMINAL CONFIDENCE BOUNDS," gives
the estimated means and variances of the bounds, the bounds being computed by
the current AFDC procedure. The line labeled "Coverage" gives the estimated
probability that the specified tail covers the true value, R. For example, for
Population A with the sample size 2400/360, the probability that the nominal
2.5 percent point is greater than R is estimated to be 1.1 percent rather than the
nominal 2.5 percent. Similarly, the probability that the nominal 97.5 percent point
is less than R is estimated to be 5.3 percent rather than the nominal 2.5 percent.
Consequently, the coverage of the corresponding 95 percent confidence interval is
estimated to be 93.6 percent (i.e., 100 - 1.1 - 5.3) rather than the nominal 95 percent.
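The tail frequencies and coverage described above can be tallied from simulated samples along the following lines. This is a sketch only: it assumes each simulated sample yields an estimate and an estimated standard error, and the function name `coverage_summary` is hypothetical.

```python
def coverage_summary(estimates, std_errors, true_value, t=1.96):
    """Return the fractions of nominal intervals whose lower bound lies above
    the true value, whose upper bound lies below it, and which cover it."""
    m = len(estimates)
    lower_above = sum(
        1 for r, s in zip(estimates, std_errors) if r - t * s > true_value
    )
    upper_below = sum(
        1 for r, s in zip(estimates, std_errors) if r + t * s < true_value
    )
    covered = m - lower_above - upper_below
    return lower_above / m, upper_below / m, covered / m
```

With the figures quoted for Population A at sample size 2400/360, the three fractions would be about .011, .053, and .936.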

The panel of the tables that is headed "NOMINAL CONFIDENCE
BOUNDS, MINIMUM rho" gives the results of a procedure we have considered (see
Chapter 3 of this report and Appendix D) to reduce the effect of unusually low
values of the estimated correlation, $\hat{\rho}$, between the state and Federal findings for the






same case. This may happen because of sampling variation. It could also happen if 
a state, inadvertently or not, does a poor job of evaluation in its QC operation. The 
procedure consists of replacing the estimated correlation by a constant value 
whenever the estimated correlation is less than that constant value. The constant 
value used in these computations was .8. The tables show that this has only a 
minor or negligible effect on the coverage properties of the resulting confidence 
intervals. 
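A minimal sketch of the minimum-correlation procedure follows. It assumes the floor of .8 is applied to the correlation inside the standard error formula for the regression estimate given in Appendix B; the function name and argument names are hypothetical.

```python
import math


def se_with_minimum_rho(s_x, r, n, n_sub, t_bar, rho_min=0.8):
    """Standard error of the regression estimate, with the estimated
    correlation r replaced by rho_min whenever r falls below rho_min."""
    r_used = max(r, rho_min)
    return s_x * math.sqrt((1 - r_used ** 2 * (1 - n_sub / n)) / n_sub) / t_bar
```

Because a higher correlation gives a smaller standard error, imposing the floor can only shorten the nominal interval, never lengthen it.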

Table C-3 summarizes the coverage of the nominal 95 percent and
90 percent confidence intervals for the three populations and various sample sizes.

These results are reasonably close to expectations for samples large
enough that both $\hat{R}$ and $s_{\hat{R}}$ are normally distributed, as shown in the Technical
Note. They also conform to the general statement made above about the effect of
the coefficient of variation of $s_{\hat{R}}$ and the correlation of $\hat{R}$ and $s_{\hat{R}}$. As seen from
Table C-3, the coverage of the 95 percent and 90 percent confidence intervals is
generally somewhat less than the nominal confidence coefficient, but reasonably
close, especially for the larger sample sizes. They may reasonably be regarded as
providing acceptable approximations to the nominal probabilities of 95 percent and
90 percent, and therefore can serve as useful measures of the precision of $\hat{R}$ as an
estimate of R.

We note from Table C-3 that, for the variance estimator that imposes a
minimum value of $\hat{\rho}$, the coverage probabilities are essentially the same as for the
variance estimator that uses the estimated $\hat{\rho}$, although slightly farther from the
nominal probabilities.

One way of circumventing or reducing the effect of the skewness of the 
distribution of R is to compute confidence intervals on a transformation of R whose 
distribution is more nearly symmetrical. If a transformation of R, say u=f(R), is 
normally distributed, and if an unbiased or consistent estimate of the standard error 
of u is available, one might have confidence bounds for the expected value of u 
whose probabilities are more nearly the nominal confidence levels. Those bounds 
could then be transformed by the inverse transformation, say g(u), to yield 






confidence bounds for R with probabilities corresponding to the nominal confidence
levels. We therefore simulated sampling from the test populations using the 
natural logarithm transformation f(R)=ln R. 

The procedure used was the following: a sample simulating a simple 
random state sample of specified size n was drawn (with replacement) from the test 
population. A subsample simulating a Federal subsample of size n' was then drawn 
(without replacement) from the state sample. Each element of the state sample was 
assigned at random to exactly one of 90 "replicate sets." The 90 replicate sets were 
subdivided at random into 45 pairs of sets, each giving rise to a Jackknife replicate 
estimate of R. The Jackknife replicate estimate corresponding to a given pair is an 
estimate that uses the data in the state and Federal samples, but replaces a random 
one of the replicate sets in the given pair by the other replicate set of the same pair. 
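
The replicate construction described above can be sketched as follows. This is a minimal illustration and not the program used for the report: the function name `paired_jackknife_weights`, the near-equal-sized random assignment of units to sets, and the representation of each replicate as a vector of weights (0 for units in the replaced set, 2 for units in the set that replaces it, 1 otherwise) are assumptions made here for clarity.

```python
import random

def paired_jackknife_weights(n_units, n_sets=90, seed=0):
    """Return one weight vector per Jackknife replicate (n_sets // 2 in all)."""
    rng = random.Random(seed)
    # Assign every sample unit to exactly one replicate set, at random.
    # (Sets of near-equal size are an assumption of this sketch.)
    set_of_unit = [i % n_sets for i in range(n_units)]
    rng.shuffle(set_of_unit)
    # Subdivide the replicate sets at random into pairs.
    sets = list(range(n_sets))
    rng.shuffle(sets)
    pairs = [(sets[2 * j], sets[2 * j + 1]) for j in range(n_sets // 2)]
    replicates = []
    for a, b in pairs:
        # Replace a random one of the two sets by the other set of the pair.
        replaced, kept = (a, b) if rng.random() < 0.5 else (b, a)
        weights = [0.0 if s == replaced else 2.0 if s == kept else 1.0
                   for s in set_of_unit]
        replicates.append(weights)
    return replicates

# 900 units in 90 sets of 10 gives 45 replicate weight vectors.
reps = paired_jackknife_weights(n_units=900)
```

Each weight vector can then be applied to whatever estimator is in use to obtain the corresponding replicate estimate; the estimator itself is deliberately left out of this sketch.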

Let R̂(i) denote the estimated payment error rate based on the i-th
Jackknife replicate, for i = 1, 2, ..., 45. The Jackknife estimate of the variance of R̂ is
given by

\[ s_{\hat R}^2 = \sum_{i=1}^{45} \left( \hat R_{(i)} - \hat R \right)^2 . \]

The estimate based on the full sample and each Jackknife replicate estimate was
then subjected to the logarithmic transformation:

\[ R^* = \ln \hat R \]

\[ R^*_{(i)} = \ln \hat R_{(i)} . \]

The Jackknife estimator of the variance of R* is then

\[ s_{R^*}^2 = \sum_{i=1}^{45} \left( R^*_{(i)} - R^* \right)^2 . \]

The confidence interval for the mathematical expectation of R* at a specified
confidence level is computed as R* ± k s_R*, where the multiplier k is appropriate to
the confidence level for a normal distribution. Denote

\[ L_1^* = R^* - k\, s_{R^*} \]

\[ L_2^* = R^* + k\, s_{R^*} \]

and let

\[ L_1 = \exp(L_1^*) \]

\[ L_2 = \exp(L_2^*) . \]

Then L1 and L2 are taken to be the lower and upper bounds,
respectively, of the confidence interval for the payment error rate, R.
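
As a hedged illustration of the logarithmic procedure, the sketch below computes the back-transformed bounds from a full-sample estimate and a list of replicate estimates. The function name `log_jackknife_interval` and the numerical inputs are hypothetical, and the variance is taken as the plain sum of squared deviations of the transformed replicate estimates from the transformed full-sample estimate, which is an assumption of this sketch.

```python
import math

def log_jackknife_interval(r_full, r_replicates, k=1.645):
    """Confidence bounds for R from a Jackknife on the log scale."""
    r_star = math.log(r_full)                      # R* = ln(R-hat)
    # Sum of squared deviations of ln(replicate) from ln(full-sample estimate).
    var_star = sum((math.log(r) - r_star) ** 2 for r in r_replicates)
    s_star = math.sqrt(var_star)
    # Bounds on the log scale, then transformed back with exp().
    return math.exp(r_star - k * s_star), math.exp(r_star + k * s_star)

# Hypothetical full-sample estimate and four replicate estimates.
lower, upper = log_jackknife_interval(0.073, [0.0725, 0.0738, 0.0729, 0.0741])
```

Because the interval is symmetric on the log scale, the back-transformed upper bound lies farther from the estimate than the lower bound does, which is the intended response to the skewness of R̂.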

For each of the four sample sizes the procedure was repeated 400 times.
Table C-4 shows the estimated coverage probabilities of the intervals corresponding
to the nominal 2.5 percent point, 5 percent point, 95 percent point, and 97.5 percent
point, as well as the estimated coverage probability corresponding to the nominal
90 percent confidence interval. It also shows, for comparison, the coverage of
confidence intervals computed by the conventional procedure described at the
beginning of this Appendix.

Later, in order to obtain additional information on the validity of the 
logarithmic transformation, the procedure was repeated an additional 1500 times for 
Population A using the sample sizes n=2400, n'=360, and an additional 2000 times 
using the sample sizes n=350, n'=160. The combined results of the two sets of 
simulations are summarized in Table 2-6 of Section 2.4 of the report.



Table C-2A. Population A: Summary statistics

STATISTIC                                        2400/360     1200/360     880/260      350/160

R = .0[?]97

Mean R̂                                           0.073037     0.072446     0.0727088    0.072901
Variance of R̂                                    6.279E-05    7.044E-05    1.073E-04    2.290E-04
Mean estimated variance of R̂                     6.446E-05    7.989E-05    1.100E-04    2.193E-04
Variance of estimated variance of R̂              4.937E-10    5.456E-10    1.539E-09    7.606E-09
Mean estimated standard error of R̂               0.007908     0.008845     0.010327     0.014513
Variance of estimated standard error of R̂        1.916E-06    1.600E-06    3.311E-06    8.542E-06

CONFIDENCE BOUNDS
  2.5% point                                     0.050890     0.055496     0.052032     0.044432
  5.0% point                                     0.060931     0.058098     0.055404     0.048894
  95.0% point                                    0.085944     0.087887     0.090412     0.098589
  97.5% point                                    0.088519     0.090339     0.094691     0.105241

NOMINAL CONFIDENCE BOUNDS
  2.5% point     Mean                            0.057537     0.055110     0.052467     0.044455
                 Variance                        3.804E-05    4.623E-05    6.487E-05    1.263E-04
                 Coverage                        0.011        0.006        0.010        0.013
  5.0% point     Mean                            0.060048     0.057896     0.055720     0.049027
                 Variance                        4.102E-05    4.929E-05    6.997E-05    1.383E-04
                 Coverage                        0.024        0.028        0.028        0.031
  95.0% point    Mean                            0.086067     0.087000     0.089697     0.096775
                 Variance                        9.493E-05    1.002E-04    1.625E-04    3.659E-04
                 Coverage                        0.084        0.097        0.100        0.102
  97.5% point    Mean                            0.088558     0.089782     0.092951     0.101347
                 Variance                        1.023E-04    1.069E-04    1.751E-04    3.973E-04
                 Coverage                        0.053        0.059        0.066        0.075

NOMINAL CONFIDENCE BOUNDS, minimum rho
  2.5% point     Mean                            0.057933     0.055389     0.052892     0.044950
                 Variance                        4.037E-05    4.744E-05    6.767E-05    1.281E-04
                 Coverage                        0.013        0.008        0.014        0.016
  5.0% point     Mean                            0.060364     0.058130     0.056077     0.049442
                 Variance                        4.322E-05    5.043E-05    7.265E-05    1.403E-04
                 Coverage                        0.030        0.030        0.032        0.034
  95.0% point    Mean                            0.085751     0.086762     0.089341     0.096360
                 Variance                        9.018E-05    9.774E-05    1.363E-04    3.595E-04
                 Coverage                        0.084        0.098        0.100        0.107
  97.5% point    Mean                            0.088182     0.089603     0.092526     0.100852
                 Variance                        9.[?]37E-05  1.038E-04    1.673E-04    3.893E-04
                 Coverage                        0.053        0.060        0.067        0.075

Note: Based on 1000 trials, for the regression estimate.

Table C-2B. Population B: Summary statistics

STATISTIC                                        2400/360     1200/360     880/260      350/160

R = .0794491

Mean R̂                                           0.078925     0.079055     0.078015     0.079299
Variance of R̂                                    5.413E-05    6.[?]19E-05  9.470E-05    2.43[?]E-04
Mean estimated variance of R̂                     5.278E-05    8.216E-05    1.119E-04    2.518E-04
Variance of estimated variance of R̂              4.356E-10    6.840E-10    1.710E-09    1.351E-08
Mean estimated standard error of R̂               0.007130     0.008933     0.010402     0.015442
Variance of estimated standard error of R̂        1.945E-06    1.944E-06    3.619E-06    1.321E-05

CONFIDENCE BOUNDS
  2.5% point                                     0.064904     0.063426     0.060174     0.050066
  5.0% point                                     0.066957     0.065943     0.062379     0.054757
  95.0% point                                    0.090900     0.094013     0.097331     0.100100
  97.5% point                                    0.094786     0.098049     0.100251     0.114120

NOMINAL CONFIDENCE BOUNDS
  2.5% point     Mean                            0.064949     0.061307     0.050420     0.049033
                 Variance                        3.506E-05    4.662E-05    6.630E-05    1.402E-04
                 Coverage                        0.011        0.012        0.008        0.017
  5.0% point     Mean                            0.067195     0.064327     0.061705     0.053097
                 Variance                        3.711E-05    4.860E-05    6.099E-05    1.5[?]E-04
                 Coverage                        0.032        0.030        0.033        0.036
  95.0% point    Mean                            0.090[?]     0.093703     0.093926     0.104702
                 Variance                        8.167E-05    9.230E-05    1.400E-04    4.016E-04
                 Coverage                        0.093        0.072        0.093        0.096
  97.5% point    Mean                            0.0[?]099    0.096604     0.099203     0.109566
                 Variance                        8.815E-05    9.869E-05    1.509E-04    4.400E-04
                 Coverage                        0.067        0.042        0.055        0.062

NOMINAL CONFIDENCE BOUNDS, minimum rho
  2.5% point     Mean                            0.064057     0.061515     0.050461     0.049109
                 Variance                        3.50[?]E-05  4.[?]66E-05  6.65[?]E-05  1.486E-04
                 Coverage                        0.011        0.012        0.009        0.017
  5.0% point     Mean                            0.067202     0.064334     0.061733     0.053961
                 Variance                        3.711E-05    [illegible]  6.922E-05    1.570E-04
                 Coverage                        0.032        0.030        0.033        0.036
  95.0% point    Mean                            0.090647     0.093776     0.095090     0.104638
                 Variance                        8.162E-05    9.221E-05    1.393E-04    4.005E-04
                 Coverage                        0.093        0.072        0.093        0.096
  97.5% point    Mean                            0.092091     0.096596     0.099170     0.109490
                 Variance                        [illegible]  [illegible]  1.503E-04    4.306E-04
                 Coverage                        0.067        0.042        0.055        0.062

Note: Based on 1000 trials, for the regression estimate.

Table C-2C. Population C: Summary statistics

STATISTIC                                        2400/360     1200/360     880/260      350/160

R = .066230

Mean R̂                                           0.0[?]5917   0.066014     0.065643     0.066066
Variance of R̂                                    7.605E-05    8.779E-05    1.191E-04    2.637E-04
Mean estimated variance of R̂                     7.96[?]E-05  9.788E-05    1.331E-04    2.631E-04
Variance of estimated variance of R̂              1.946E-09    2.005E-09    5.184E-09    2.738E-08
Mean estimated standard error of R̂               0.008616     0.009636     0.011163     0.013317
Variance of estimated standard error of R̂        5.147E-06    4.561E-06    8.373E-06    2.221E-05

CONFIDENCE BOUNDS
  2.5% point                                     0.049866     0.047857     0.044524     0.037619
  5.0% point                                     0.052507     0.050076     0.047661     0.041242
  95.0% point                                    0.081174     0.082178     0.085395     0.094651
  97.5% point                                    0.084358     0.085426     0.088494     0.101151

NOMINAL CONFIDENCE BOUNDS
  2.5% point     Mean                            0.049031     0.047088     0.043764     0.033653
                 Variance                        4.300E-05    5.446E-05    6.8[?]E-05   1.373E-04
                 Coverage                        0.003        0.011        0.009        0.007
  5.0% point     Mean                            0.051745     0.050129     0.047281     0.040541
                 Variance                        4.550E-05    5.745E-05    [illegible]  1.462E-04
                 Coverage                        0.014        0.021        0.020        0.028
  95.0% point    Mean                            0.080090     0.081890     0.084003     0.091591
                 Variance                        1.359E-04    1.428E-04    2.113E-04    5.015E-04
                 Coverage                        0.093        0.103        0.113        0.120
  97.5% point    Mean                            0.082004     0.084940     0.087322     0.096479
                 Variance                        1.507E-04    1.562E-04    2.341E-04    5.600E-04
                 Coverage                        0.060        0.080        0.084        0.087

NOMINAL CONFIDENCE BOUNDS, minimum rho
  2.5% point     Mean                            0.04[?]090   0.047749     0.044640     0.036712
                 Variance                        4.481E-05    5.374E-05    6.906E-05    1.357E-04
                 Coverage                        0.010        0.011        0.011        0.009
  5.0% point     Mean                            0.052473     0.050685     0.048022     0.041429
                 Variance                        4.810E-05    5.924E-05    7.407E-05    1.4[?]E-04
                 Coverage                        0.029        0.028        0.026        0.030
  95.0% point    Mean                            0.079361     0.081343     0.0832[?]    0.090703
                 Variance                        1.220E-04    1.336E-04    1.957E-04    4.748E-04
                 Coverage                        0.098        0.104        0.116        0.124
  97.5% point    Mean                            0.081936     0.084278     0.086638     0.096420
                 Variance                        1.[?]79E-04  1.443E-04    2.140E-04    5.260E-04
                 Coverage                        0.062        0.084        0.088        0.090

Note: Based on 1000 trials, for the regression estimate.

Table C-3. Estimated coverage of 95 percent and 90 percent nominal confidence intervals for three test
populations, based on alternative regression estimators using the estimated ρ and a
minimum ρ of .8

                    Population A           Population B           Population C
State   Federal   Estimated  Minimum     Estimated  Minimum     Estimated  Minimum
  n       n'          ρ         ρ            ρ         ρ            ρ         ρ

95 percent nominal confidence interval
 2400     360       0.936     0.934        0.922     0.922        0.937     0.928
 1200     360       0.935     0.932        0.946     0.946        0.909     0.905
  880     260       0.924     0.919        0.937     0.936        0.907     0.901
  350     160       0.912     0.909        0.921     0.921        0.906     0.901

90 percent nominal confidence interval
 2400     360       0.892     0.886        0.875     0.875        0.893     0.873
 1200     360       0.875     0.872        0.898     0.898        0.876     0.868
  880     260       0.872     0.868        0.874     0.874        0.867     0.858
  350     160       0.867     0.859        0.868     0.868        0.852     0.846

Note: Based on 1000 independent replicate samples from each population for each sample size. The same
replicate was used with the estimated ρ and the minimum ρ.



Table C-4. Coverage of confidence intervals by logarithmic Jackknife, Population A

                        Conventional intervals                     Logarithmic transform of intervals
Sample               Trial  Trial  Trial  Trial                 Trial  Trial  Trial  Trial
size       Point      #1     #2     #3     #4    Average         #1     #2     #3     #4    Average

2400/360   <.025      .02    .00    .00    .01    .0075          .02    .00    .02    .01    .0125
           <.05       .03    .00    .02    .01    .0150          .04    .01    .05    .04    .0350
           Between    .84    .91    .88    .92    .8875          .88    .92    .85    .90    .8875
           >.95       .13    .09    .10    .07    .0975          .08    .07    .10    .06    .0775
           >.975      .06    .06    .07    .05    .0600          .04    .03    .05    .02    .0350

1200/360   <.025      .02    .02    .01    .01    .0150          .04    .02    .04    .03    .0325
           <.05       .05    .03    .04    .03    .0375          .07    .03    .05    .04    .0475
           Between    .87    .89    .88    .90    .8850          .87    .90    .88    .91    .8900
           >.95       .08    .08    .08    .07    .0775          .06    .07    .07    .05    .0625
           >.975      .06    .05    .06    .03    .0500          .05    .02    .04    .03    .0350

880/260    <.025      .01    .00    .01    .03    .0125          .01    .00    .03    .07    .0275
           <.05       .01    .00    .04    .09    .0350          .04    .05    .08    .10    .0675
           Between    .87    .93    .91    .84    .8875          .85    .90    .87    .85    .8675
           >.95       .12    .07    .05    .07    .0775          .11    .05    .05    .05    .0650
           >.975      .11    .05    .05    .06    .0675          .06    .04    .04    .02    .0400

350/160    <.025      .00    .01    .01    .01    .0075          .02    .01    .04    .04    .0275
           <.05       .02    .02    .04    .04    .0300          .04    .03    .06    .08    .0525
           Between    .90    .85    .86    .85    .8650          .91    .87    .89    .85    .8800
           >.95       .08    .13    .10    .11    .1050          .05    .10    .05    .07    .0675
           >.975      .05    .10    .06    .08    .0725          .01    .04    .02    .03    .0250

Note: Each trial used 100 repetitions, and each repetition used 45 replicates.



Figure C-1. Distribution of estimated standard error

[Figure not reproduced: frequency curves A through D. Vertical axis: Frequency (0 to 300).
Horizontal axis: Estimated standard error (about 0.003 to 0.023).]

Sample sizes

        n      n'
A    2400     360
B    1200     360
C     880     260
D     350     160



Figure C-2A. Scatterplot for Population A - sample size 1

Figure C-2B. Scatterplot for Population A - sample size 2

Figure C-2C. Scatterplot for Population A - sample size 3

Figure C-2D. Scatterplot for Population A - sample size 4

[Scatterplots not reproduced. Horizontal axis: ESTIMATED ERROR RATE (about .04 to .12).
Vertical axis ticks run from about .004 to .016.]
TECHNICAL NOTE FOR APPENDIX C
A Note on Confidence Intervals 



If a simple random sample of size n is drawn from a normal
distribution, the mean of the population may be estimated by

\[ \bar{x} = \sum x_i / n \]

and the variance of x̄ by

\[ s_{\bar{x}}^2 = \sum (x_i - \bar{x})^2 / \big( (n-1)\, n \big) . \]

If X̄ denotes the population mean, the statistic (x̄ − X̄)/s_x̄ has the Student t
distribution, so that a confidence interval with confidence coefficient α is given by

\[ \bar{x} \pm t(\alpha)\, s_{\bar{x}} \]

where t(α) is taken from the Student t distribution or from the normal distribution
if n is large (say n > 30).
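
A minimal sketch of this conventional interval follows, using the normal multiplier that the text allows for large n. The function name `mean_confidence_interval` and the sample data are illustrative assumptions; `alpha` here denotes the confidence coefficient (for example .95), following the usage in the text.

```python
from statistics import NormalDist, fmean

def mean_confidence_interval(xs, alpha=0.95):
    """Conventional interval: mean +/- multiplier * estimated standard error."""
    n = len(xs)
    x_bar = fmean(xs)
    # Estimated variance of the sample mean: sum of squares / ((n - 1) * n).
    var_mean = sum((x - x_bar) ** 2 for x in xs) / ((n - 1) * n)
    s_mean = var_mean ** 0.5
    # Two-sided normal multiplier (about 1.96 when alpha = 0.95).
    k = NormalDist().inv_cdf(0.5 + alpha / 2)
    return x_bar - k * s_mean, x_bar + k * s_mean

# Illustrative data only: the values 1, 2, ..., 40.
lo, hi = mean_confidence_interval([float(v) for v in range(1, 41)])
```

For small n the multiplier would be taken from the Student t distribution instead, which the standard library does not provide.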

Even when the conditions given above are not satisfied, the confidence
interval is often estimated in the same way, on the assumption that since the
distribution of x̄ is approximately normal for a large sample, the procedure ensures
that the probability that the interval will cover the population mean X̄ is
approximately α. It is often assumed that the probability that X̄ is below (or above)
the interval is approximately (1-α)/2. The fact, however, is that for samples drawn
from skewed distributions the statistics x̄ and s_x̄ are correlated, and consequently the
probability that X̄ is below the interval is not necessarily equal to the probability that
X̄ is above the interval. Actually, in sampling from skewed distributions, the joint
distribution of x̄ and s_x̄ may approach normality reasonably closely for samples of
moderate size, but x̄ and s_x̄ remain correlated, and the correlation remains about
the same as sample size increases. Also, the variance of s_x̄ may be much greater
than if sampling from a normal distribution. We evaluate the probabilities
associated with 90 percent and 95 percent nominal confidence intervals for this case,
i.e., x̄ and s_x̄ are jointly normally distributed but correlated, with various
possible values of the coefficient of correlation depending on the skewness of the
distribution from which the sample was drawn.
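
The persistence of this correlation can be checked with a small simulation. The sketch below is illustrative only: it draws samples from an exponential distribution, chosen here simply as a convenient skewed distribution (it is not one of the report's test populations), and the function name `corr_mean_se` is hypothetical.

```python
import random
from math import sqrt

def corr_mean_se(n, reps=1000, seed=1):
    """Correlation, over repeated samples, of the mean and its standard error."""
    rng = random.Random(seed)
    means, ses = [], []
    for _ in range(reps):
        xs = [rng.expovariate(1.0) for _ in range(n)]   # skewed population
        m = sum(xs) / n
        se = sqrt(sum((x - m) ** 2 for x in xs) / ((n - 1) * n))
        means.append(m)
        ses.append(se)
    mm, ms = sum(means) / reps, sum(ses) / reps
    cov = sum((a - mm) * (b - ms) for a, b in zip(means, ses))
    va = sum((a - mm) ** 2 for a in means)
    vb = sum((b - ms) ** 2 for b in ses)
    return cov / sqrt(va * vb)

# The correlation stays clearly positive at both sample sizes.
r_small, r_large = corr_mean_se(50), corr_mean_se(400)
```

Both correlations should come out well above zero and of similar size, consistent with the statement that the correlation does not fade as the sample size grows.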

Suppose that a variable u has the normal distribution with mean μ and
variance σ², and that a variable s has a normal distribution with mean σ and
variance τ², and that the correlation of u and s is ρ. Let k be a constant and define
the upper and lower bounds of a confidence interval by

\[ \ell = u \pm k s . \]

The variable ℓ is normally distributed, with

\[ E(\ell) = \mu \pm k\sigma \]

\[ \mathrm{Var}(\ell) = \mathrm{Var}(u) + k^2\, \mathrm{Var}(s) \pm 2k\, \mathrm{Cov}(u,s) \]

\[ \qquad\quad = \sigma^2 + k^2\tau^2 \pm 2k\rho\sigma\tau \]

\[ \qquad\quad = V^2, \text{ say.} \]
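We wish to evaluate

\[ \mathrm{Prob}(\ell \le \mu) = \left( V\sqrt{2\pi} \right)^{-1} \int_{-\infty}^{\mu} \exp\left\{ -(x - \mu \mp k\sigma)^2 / 2V^2 \right\} dx . \]

Let

\[ y = (x - \mu \mp k\sigma)/V \]

so that

\[ x = Vy + \mu \pm k\sigma \]

\[ dx = V\, dy \]

and

\[ \mathrm{Prob}(\ell \le \mu) = \left( \sqrt{2\pi} \right)^{-1} \int_{-\infty}^{\mp k\sigma/V} \exp(-y^2/2)\, dy . \]

We may define

\[ z^2 = \frac{V^2}{\sigma^2} = 1 + k^2 \left( \frac{\tau}{\sigma} \right)^2 \pm 2k\rho \left( \frac{\tau}{\sigma} \right) \]

so that we may write

\[ \mathrm{Prob}(\ell \le \mu) = \left( \sqrt{2\pi} \right)^{-1} \int_{-\infty}^{\mp k/z} \exp(-y^2/2)\, dy . \]

Note that τ/σ is the coefficient of variation of s.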

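The probability that the lower bound of the confidence interval is
greater than μ is thus

\[ 1 - \left( \sqrt{2\pi} \right)^{-1} \int_{-\infty}^{k/z} \exp(-y^2/2)\, dy \]

and the probability that the upper bound is less than μ is

\[ \left( \sqrt{2\pi} \right)^{-1} \int_{-\infty}^{-k/z} \exp(-y^2/2)\, dy , \]

where z is evaluated with the minus sign for the lower bound and with the plus
sign for the upper bound.

We may call these the coverage probabilities of the lower and upper
"tails" of the confidence interval, respectively.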
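
The two tail probabilities can be evaluated numerically from the formulas above. The following is a minimal sketch under the stated bivariate normal model; the function names `normal_cdf` and `tail_probabilities` are illustrative, and `cv_s` stands for the coefficient of variation of s.

```python
from math import erf, sqrt

def normal_cdf(y):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def tail_probabilities(rho, cv_s, k):
    """Return (lower-tail, upper-tail) probabilities for the bounds u -/+ k*s."""
    # z uses the minus sign for the lower bound and the plus sign for the upper.
    z_lower = sqrt(1.0 + (k * cv_s) ** 2 - 2.0 * k * rho * cv_s)
    z_upper = sqrt(1.0 + (k * cv_s) ** 2 + 2.0 * k * rho * cv_s)
    p_lower = 1.0 - normal_cdf(k / z_lower)   # lower bound exceeds the true value
    p_upper = normal_cdf(-k / z_upper)        # upper bound falls below the true value
    return p_lower, p_upper

# Correlation .7, coefficient of variation .1, nominal 95 percent interval:
# approximately .0125 and .0436.
p_lo, p_hi = tail_probabilities(rho=0.7, cv_s=0.1, k=1.96)
```

The coverage of the interval itself is then one minus the sum of the two returned values.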

In Table C-5 we show the values of these probabilities for the nominal 
95 percent confidence interval (in which case one takes k = 1.96) and for the nominal 
90 percent confidence interval (in which case one takes k = 1.645). The 
computations are shown for various combinations of ρ (in the column headed
"Rho") and τ/σ (in the column headed "CV(s)", the coefficient of variation of the
estimated standard error). The coverage probability of the confidence interval itself 
is simply the complement of the sum of the coverage probabilities of the tails. In 
each of the columns headed "Bias" we show the difference between the nominal 
probability and the actual probability. Note that this follows the statistical 
convention of showing the estimate (taken to be the nominal probability) minus the 
value being estimated (taken to be the true probability of the tail). 

To illustrate, consider a case in which ρ = .7 and CV(s) = .1. For a
nominal 95 percent confidence interval, the probability that the value being
estimated is in the lower tail (i.e., the lower bound is greater than the true value) is
.0125 and the probability that the value being estimated is in the upper tail (i.e., the
upper bound is less than the true value) is .0436. Since the nominal probabilities are
both .025, the biases are, respectively, .025 - .0125 = .0125 and .025 - .0436 = -.0186.

The relevance of this discussion to the AFDC-QC sample estimates is
that the estimated error rate R̂ and the estimated standard error s_R are
approximately jointly normally distributed, but with positive correlations (these
positive correlations are essentially constant for all sample sizes from a given
population). Thus, R̂ and s_R are (approximately) examples of the variables u and s
in the above analysis. The coverage probabilities read from Table C-5 are reasonably
consistent with those estimated from simulated sampling from the test populations
as displayed in Table 2-4, for the estimated values of ρ and of the coefficient of
variation of the estimated standard error given in Table C-1. The tail probabilities of
the nominal confidence intervals, as estimated by simulated sampling from the test
populations with various sample sizes and as given by the normal model, are
compared in Table C-6.



Table C-5. Bias of nominal coverage probabilities, for samples from a skewed distribution*

               -------------- 95% Confidence Intervals --------------   -------------- 90% Confidence Intervals --------------
                Lower tail         Interval          Upper tail          Lower tail         Interval          Upper tail
Rho   CV(s)    Prob.    Bias     Prob.    Bias     Prob.    Bias        Prob.    Bias     Prob.    Bias     Prob.    Bias

0.9   0.5      .0000    .0250    .8451    .1049    .1549   -.1299       .0001    .0499    .8226    .0774    .1773   -.1273
0.9   0.4      .0000    .0250    .8701    .0799    .1299   -.1049       .0005    .0495    .8449    .0551    .1546   -.1046
0.9   0.3      .0001    .0249    .8968    .0532    .1031   -.0781       .0029    .0471    .8672    .0328    .1299   -.0799
0.9   0.2      .0017    .0233    .9230    .0270    .0753   -.0503       .0110    .0390    .8854    .0146    .1036   -.0536
0.9   0.1      .0090    .0160    .9428    .0072    .0483   -.0233       .0272    .0228    .8965    .0035    .0763   -.0263
0.9   0.08     .0115    .0135    .9453    .0047    .0432   -.0182       .0313    .0187    .8978    .0022    .0709   -.0209
0.9   0.06     .0143    .0107    .9474    .0026    .0383   -.0133       .0357    .0143    .8988    .0012    .0656   -.0156
0.9   0.04     .0175    .0075    .9488    .0012    .0336   -.0086       .0403    .0097    .8995    .0005    .0603   -.0103
0.9   0.02     .0211    .0039    .9497    .0003    .0292   -.0042       .0450    .0050    .8999    .0001    .0551   -.0051

0.8   0.5      .0009    .0241    .8507    .0993    .1484   -.1234       .0031    .0469    .8261    .0739    .1708   -.1208
0.8   0.4      .0005    .0245    .8758    .0742    .1236   -.0986       .0038    .0462    .8478    .0522    .1484   -.0984
0.8   0.3      .0010    .0240    .9015    .0485    .0975   -.0725       .0073    .0427    .8684    .0316    .1243   -.0743
0.8   0.2      .0035    .0215    .9256    .0244    .0710   -.0460       .0155    .0345    .8854    .0146    .0991   -.0491
0.8   0.1      .0107    .0143    .9434    .0066    .0459   -.0209       .0299    .0201    .8963    .0037    .0738   -.0238
0.8   0.08     .0129    .0121    .9457    .0043    .0413   -.0163       .0335    .0165    .8976    .0024    .0688   -.0188
0.8   0.06     .0155    .0095    .9476    .0024    .0369   -.0119       .0373    .0127    .8987    .0013    .0640   -.0140
0.8   0.04     .0184    .0066    .9489    .0011    .0327   -.0077       .0414    .0086    .8994    .0006    .0592   -.0092
0.8   0.02     .0215    .0035    .9497    .0003    .0287   -.0037       .0456    .0044    .8999    .0001    .0545   -.0045

0.7   0.5      .0053    .0197    .8532    .0968    .1415   -.1165       .0116    .0384    .8244    .0756    .1640   -.1140
0.7   0.4      .0032    .0218    .8798    .0702    .1170   -.0920       .0107    .0393    .8474    .0526    .1418   -.0918
0.7   0.3      .0033    .0217    .9050    .0450    .0916   -.0666       .0135    .0365    .8681    .0319    .1185   -.0685
0.7   0.2      .0059    .0191    .9276    .0224    .0665   -.0415       .0205    .0295    .8850    .0150    .0945   -.0445
0.7   0.1      .0125    .0125    .9440    .0060    .0436   -.0186       .0327    .0173    .8961    .0039    .0712   -.0212
0.7   0.08     .0145    .0105    .9461    .0039    .0394   -.0144       .0358    .0142    .8975    .0025    .0667   -.0167
0.7   0.06     .0167    .0083    .9478    .0022    .0355   -.0105       .0390    .0110    .8986    .0014    .0623   -.0123
0.7   0.04     .0192    .0058    .9490    .0010    .0318   -.0068       .0425    .0075    .8994    .0006    .0581   -.0081
0.7   0.02     .0220    .0030    .9498    .0002    .0283   -.0033       .0462    .0038    .8999    .0001    .0540   -.0040

0.6   0.5      .0134    .0116    .8523    .0977    .1342   -.1092       .0238    .0262    .8195    .0805    .1567   -.1067
0.6   0.4      .0085    .0165    .8814    .0686    .1101   -.0851       .0201    .0299    .8449    .0551    .1349   -.0849
0.6   0.3      .0071    .0179    .9073    .0427    .0856   -.0606       .0208    .0292    .8669    .0331    .1124   -.0624
0.6   0.2      .0089    .0161    .9291    .0209    .0620   -.0370       .0257    .0243    .8844    .0156    .0898   -.0398
0.6   0.1      .0144    .0106    .9444    .0056    .0412   -.0162       .0355    .0145    .8960    .0040    .0686   -.0186
0.6   0.08     .0161    .0089    .9464    .0036    .0376   -.0126       .0380    .0120    .8974    .0026    .0646   -.0146
0.6   0.06     .0179    .0071    .9480    .0020    .0341   -.0091       .0407    .0093    .8986    .0014    .0607   -.0107
0.6   0.04     .0201    .0049    .9491    .0009    .0308   -.0058       .0436    .0064    .8994    .0006    .0570   -.0070
0.6   0.02     .0224    .0026    .9498    .0002    .0278   -.0028       .0467    .0033    .8999    .0001    .0534   -.0034

0.5   0.5      .0239    .0011    .8496    .1004    .1265   -.1015       .0375    .0125    .8134    .0866    .1490   -.0990
0.5   0.4      .0158    .0092    .8814    .0686    .1028   -.0778       .0308    .0192    .8415    .0585    .1276   -.0776
0.5   0.3      .0122    .0128    .9085    .0415    .0793   -.0543       .0288    .0212    .8653    .0347    .1060   -.0560
0.5   0.2      .0124    .0126    .9302    .0198    .0575   -.0325       .0312    .0188    .8838    .0162    .0850   -.0350
0.5   0.1      .0164    .0086    .9448    .0052    .0389   -.0139       .0383    .0117    .8958    .0042    .0659   -.0159
0.5   0.08     .0177    .0073    .9466    .0034    .0357   -.0107       .0402    .0098    .8973    .0027    .0624   -.0124
0.5   0.06     .0192    .0058    .9481    .0019    .0327   -.0077       .0424    .0076    .8985    .0015    .0591   -.0091
0.5   0.04     .0209    .0041    .9492    .0008    .0299   -.0049       .0448    .0052    .8994    .0006    .0559   -.0059
0.5   0.02     .0229    .0021    .9498    .0002    .0273   -.0023       .0473    .0027    .8999    .0001    .0529   -.0029

0.4   0.5      .0354   -.0104    .8462    .1038    .1184   -.0934       .0516   -.0016    .8076    .0924    .1408   -.0908
0.4   0.4      .0243    .0007    .8805    .0695    .0953   -.0703       .0420    .0080    .8380    .0620    .1200   -.0700
0.4   0.3      .0181    .0069    .9090    .0410    .0729   -.0479       .0371    .0129    .8636    .0364    .0994   -.0494
0.4   0.2      .0162    .0088    .9309    .0191    .0528   -.0278       .0368    .0132    .8832    .0168    .0801   -.0301
0.4   0.1      .0184    .0066    .9451    .0049    .0365   -.0115       .0411    .0089    .8957    .0043    .0632   -.0132
0.4   0.08     .0194    .0056    .9468    .0032    .0338   -.0088       .0425    .0075    .8972    .0028    .0603   -.0103
0.4   0.06     .0205    .0045    .9482    .0018    .0313   -.0063       .0441    .0059    .8985    .0015    .0574   -.0074
0.4   0.04     .0218    .0032    .9492    .0008    .0290   -.0040       .0459    .0041    .8993    .0007    .0548   -.0048
0.4   0.02     .0233    .0017    .9498    .0002    .0269   -.0019       .0478    .0022    .8999    .0001    .0523   -.0023

0.3   0.5      .0472   -.0222    .8431    .1069    .1098   -.0848       .0652   -.0152    .8027    .0973    .1321   -.0821
0.3   0.4      .0335   -.0085    .8792    .0708    .0873   -.0623       .0532   -.0032    .8349    .0651    .1118   -.0618
0.3   0.3      .0246    .0004    .9091    .0409    .0663   -.0413       .0455    .0045    .8620    .0380    .0925   -.0425
0.3   0.2      .0204    .0046    .9314    .0186    .0481   -.0231       .0424    .0076    .8826    .0174    .0750   -.0250
0.3   0.1      .0205    .0045    .9453    .0047    .0342   -.0092       .0439    .0061    .8956    .0044    .0605   -.0105
0.3   0.08     .0211    .0039    .9470    .0030    .0319   -.0069       .0447    .0053    .8972    .0028    .0581   -.0081
0.3   0.06     .0218    .0032    .9483    .0017    .0299   -.0049       .0458    .0042    .8984    .0016    .0558   -.0058
0.3   0.04     .0227    .0023    .9492    .0008    .0281   -.0031       .0470    .0030    .8993    .0007    .0537   -.0037
0.3   0.02     .0237    .0013    .9498    .0002    .0264   -.0014       .0484    .0016    .8999    .0001    .0517   -.0017

*(Based on a model in which x̄ and s_x̄ have a bivariate normal distribution with correlation ρ.)



Table C-6. Tail coverages as estimated by simulation and as given by the normal model

                                 ------- 95% Confidence Interval -------    ------- 90% Confidence Interval -------
                                   Lower tail           Upper tail            Lower tail           Upper tail
Sample size     Rho    CV(s)    Estimated  Modeled   Estimated  Modeled    Estimated  Modeled   Estimated  Modeled

Population A
2400/360        0.75   0.18      0.011     0.006      0.053     0.064       0.024     0.020      0.084     0.092
1200/360        0.75   0.14      0.006     0.008      0.059     0.054       0.028     0.025      0.097     0.082
880/260         0.76   0.18      0.010     0.005      0.066     0.064       0.028     0.020      0.100     0.092
350/160         0.79   0.20      0.013     0.004      0.075     0.071       0.031     0.016      0.102     0.099

Population B
2400/360        0.66   0.20      0.011     0.007      0.067     0.065       0.032     0.023      0.093     0.093
1200/360        0.62   0.16      0.012     0.010      0.042     0.054       0.030     0.028      0.072     0.082
880/260         0.61   0.18      0.008     0.009      0.055     0.058       0.033     0.027      0.093     0.086
350/160         0.67   0.24      0.017     0.005      0.062     0.075       0.036     0.019      0.096     0.103

Population C
2400/360        0.68   0.27      0.003     0.004      0.060     0.083       0.014     0.016      0.093     0.110
1200/360        0.66   0.22      0.011     0.006      0.080     0.070       0.021     0.021      0.103     0.097
880/260         0.68   0.26      0.009     0.005      0.084     0.080       0.020     0.017      0.113     0.108
350/160         0.71   0.30      0.007     0.003      0.087     0.092       0.028     0.013      0.120     0.119



APPENDIX D

RELIABILITY OF LOWER CONFIDENCE BOUNDS

D.1 Variances of Lower Confidence Bounds and Point Estimates Compared



The estimated variances and standard errors of the regression estimate
of R and of the lower bound of the confidence interval, based on 1000 independent
replicates sampled from each test population, for each of several sample sizes, are
shown in Table D-1. In this analysis, the lower confidence bound, L, has been
computed at the 95 (or 5) percent nominal confidence level, i.e., L = R̂ - t s_R with
t = 1.645. From the table, it can be seen that the estimated variances of the lower
confidence bounds (s_L²) vary from about one-third to two-thirds as large as the
variances of the estimated payment error rates (s_R²), depending on the state sample
size and the fraction in the Federal subsample. The standard errors of L vary from
about 60 to 80 percent of the standard error of R̂.



Table D-1. Variances and standard errors of 95 percent lower confidence bounds and of estimated
payment error rates, for regression estimator, for three test populations for seven
illustrative sample sizes

    Sample size            Population A            Population B            Population C
   n     n'   n'/n    s_L^2/s_R^2  s_L/s_R    s_L^2/s_R^2  s_L/s_R    s_L^2/s_R^2  s_L/s_R

 2400   360   .15         .65        .81          .69        .83          .60        .77
 1200   360   .30         .70        .84          .75        .86          .65        .81
  880   260   .30         .65        .81          .73        .85          .61        .78
  350   160   .46         .60        .78          .64        .80          .55        .74
 1200   180   .15         .40        .64          n/a        n/a          n/a        n/a
  500    80   .16         .36        .60          n/a        n/a          n/a        n/a
  300    50   .17         .32        .56          n/a        n/a          n/a        n/a






These results are both surprising and interesting. They are far different
from what would occur in estimating a mean and computing confidence intervals
from a simple random sample from approximately normal distributions. They
would also have desirable implications for AFDC if lower confidence bounds were
to be used in determining disallowances. The relatively smaller variances of L occur
because $\hat{R}$ and $s_{\hat{R}}$ are positively correlated. Consequently, if
$\hat{R}$ is high, then $s_{\hat{R}}$ tends also to be high and the computed lower
bound is, on the average, lower than it would be if the standard error of $\hat{R}$
were known and used to compute it, and vice versa. On the other hand, in sampling
from a normal distribution, the estimated mean and its estimated standard error are
uncorrelated and there is no such compensation in the computed lower confidence
bound, and the variance of the computed lower confidence bound would be larger
than the variance of the mean.

The estimated correlations observed in the sets of 1000 replicates for 
various sample sizes from the three test populations are summarized in Table C-1 
in Appendix C, and are seen to be quite high (of the order of .6 to .8). They vary
trivially with sample size, and this variation apparently is due primarily to 

sampling variability. 

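To provide additional insight, since the nominal 95 percent lower
confidence bound is

$$L = \hat{R} - 1.645\, s_{\hat{R}},$$

it follows that the variance of L is

$$\sigma_L^2 = \sigma_{\hat{R}}^2 + (1.645)^2\, \sigma_{s_{\hat{R}}}^2 - 2(1.645)\, \rho\, \sigma_{\hat{R}}\, \sigma_{s_{\hat{R}}},$$

where $\rho$ is the correlation of $\hat{R}$ and $s_{\hat{R}}$.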






The first term in $\sigma_L^2$ is the variance of $\hat{R}$; the second term is the
contribution from the variance of the estimated $s_{\hat{R}}$, the standard error of
$\hat{R}$; and the third term is determined by $\rho$, the correlation of $\hat{R}$
and $s_{\hat{R}}$. Some estimates of $\sigma_{\hat{R}}^2$ and $\sigma_{s_{\hat{R}}}^2$
based on the 1000 replicates are given in Tables C-2A, B, and C, and are
summarized in Table C-1 in Appendix C. Estimates of $\rho$ are also given in Table C-1.
The variance of the lower confidence bound for the regression estimator can be
obtained by making the appropriate substitutions in the above equation for
$\sigma_L^2$. The results agree closely with the values given in Table D-1, which were
obtained by computing the variance of L directly from the 1000 replicates.
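
The variance decomposition for L can be checked numerically. The following is a minimal sketch and not part of the original analysis: the function name and the input values are illustrative assumptions, chosen only to show how a positive correlation between the estimate and its estimated standard error pulls the variance of L below the variance of the estimate itself.

```python
import math

T = 1.645  # one-sided 95 percent normal deviate used in the text


def var_lower_bound(var_r, var_s, rho, t=T):
    """Variance of L = R - t*s, given Var(R), Var(s) and their correlation rho."""
    cov = rho * math.sqrt(var_r) * math.sqrt(var_s)
    return var_r + t ** 2 * var_s - 2.0 * t * cov


# Hypothetical inputs (not taken from Table C-1): Var(R) = 1.0, Var(s) = 0.1.
uncorrelated = var_lower_bound(1.0, 0.1, rho=0.0)
correlated = var_lower_bound(1.0, 0.1, rho=0.7)
print(uncorrelated, correlated)
```

With a zero correlation the variance of L exceeds the variance of the estimate, as in the normal-distribution case; with a correlation in the range reported for the test populations it falls well below it.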

The implication of these results, as stated earlier (Section 2.5.2), is that 
the lower confidence bound computed by use of the estimated standard error of R 
from the sample is a substantially more stable and better way to compute the lower 
confidence bound than would be obtained if the unknown true value of the 
standard error were in fact known and used in computing the lower confidence 
limit. 



D.2 Use of Minimum Correlation in Computing Lower Confidence Bound
to Control Possible Lower Quality of State QC

It has been recognized at OFA, and is a source of concern, that if a lower
confidence bound is used in computing disallowances, a state could achieve a
considerably lower average disallowance simply by doing a lower-quality QC job,
and thereby yielding a lower correlation between the Federal review results and the
state QC results. This effect can be seen by examining the role of r (the correlation)
in Equation (3), Chapter 1. While it may or may not be likely that this would occur 
in practice, there is a concern that it might, since the higher the quality of the work 
done on QC in a state, the higher the correlation, and, as a consequence, the higher 
the lower confidence bound and the higher the disallowance. 






There is a simple solution to this potential problem. The procedure is
to identify those states for which r, the estimated correlation between the state and
Federal QC results, is less than $r_L$, where $r_L$ is, perhaps, the 30th percentile of the
state estimates of r for the prior year; that is, $r_L$ is the value such that 30 percent of
the observed state correlations of state and Federal payment errors in the prior year
are below $r_L$, and 70 percent are above. An acceptable variant of this procedure is to
substitute a constant value for $r_L$ that would approximate the 30 percent rule. The
constant can be chosen based on recent prior experience. We would expect that for
many or most states for which the estimated correlation is below $r_L$, the low
correlation will occur primarily because of sampling variability. The procedure is to
substitute $r_L$ for r in Equation (3) of Chapter 1 in estimating the variance of $\hat{R}$
whenever r is less than $r_L$. The principal gain from this procedure is that it removes
or reduces any gain that could result if a state did poorer-quality QC work in order to
reduce disallowances. An additional minor advantage is that it slightly reduces the
variance of the lower confidence bounds, at the cost of a slight downward bias in the
variance estimate.

We illustrate the application of this procedure as follows. Suppose the
"30 percent" rule is adopted, and that $r_L$ = .80 is the 30th percentile of the state
correlations for the prior year. Suppose that for a particular state n' = 360 and
n = 2400, and the observed correlation is .50. This relatively low correlation might
arise either because the state QC reviewers have done poor work (whether
purposefully or not), or because of random variation, or some of both. The ratio of
the computed standard error of $\hat{R}$ with .50 substituted for r in Equation (3) to the
standard error if .80 is substituted is 1.31. Thus, the use of the standard error
computed with $r_L$ = .80 substituted for r will substantially raise the lower confidence
limit.
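
The ratio of 1.31 in this illustration can be reproduced with a short sketch. Equation (3) of Chapter 1 is not reproduced in this appendix, so the sketch assumes that the variance of the regression estimator is proportional to 1 - r^2 (1 - n'/n), the usual form for a regression estimator in double sampling. The function names are hypothetical.

```python
import math


def variance_factor(r, n_fed, n_state):
    """Assumed factor 1 - r^2 (1 - n'/n) multiplying the unit variance."""
    return 1.0 - r ** 2 * (1.0 - n_fed / n_state)


def effective_correlation(r, r_min):
    """Minimum correlation rule: use r_min whenever the observed r is below it."""
    return max(r, r_min)


n_fed, n_state = 360, 2400
r_observed, r_min = 0.50, 0.80

se_ratio = math.sqrt(
    variance_factor(r_observed, n_fed, n_state)
    / variance_factor(effective_correlation(r_observed, r_min), n_fed, n_state)
)
print(round(se_ratio, 2))  # 1.31
```

Under the assumed form, a state whose observed correlation already exceeds the minimum is unaffected, since the rule only ever raises the correlation used in the variance computation.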

Table D-2 shows the distribution of the estimated state correlations for
each fiscal year from 1981 to 1984 for the 44 states that did not treat the QC samples as
stratified samples in making sample estimates in any of the four years. Figure D-1
shows the cumulative distribution of the correlations for each year for the same
44 states. Figures D-2A, D-2B, and D-2C show the cumulative distribution of the
estimated correlation, based on the 1000 independent samples from each of the Test






Populations A, B, and C, respectively, for each sample size. It will be noted that in 
each of these three figures the two distributions for which the Federal subsample 
size n' is 360 are nearly indistinguishable. 

Figures D-3A, D-3B, and D-3C illustrate for Test Populations A, B, and C 
the reductions in variance that result from the application of the 30th percentile 
rule where all correlations are estimated from samples of the same population. 
Note that in these figures the curves based on the estimated and the minimum 
correlations are almost indistinguishable. When they overlap, only one is shown. 

We note that whether or not the rule of substituting $r_L$ for r is applied
in computing the standard error of $\hat{R}$ for a state, the value of $\hat{R}$ is based
entirely on the sample for the state, and the computation of $\hat{R}$ is unaffected by
the substitution of $r_L$ for r. Also, while the use of the minimum correlation rule
makes a substantial difference in the variance estimates for individual states for
which the estimated correlation is low, it only moderately reduces the estimated
variance over all possible samples that could be drawn. This is clearly illustrated by
Figures D-3A, D-3B, and D-3C.

We note another important point in connection with the possible use 
of lower confidence bounds for assessing disallowances. This is that the lower 
confidence bound, and consequently the expected disallowance, would average 
lower for a relatively small than for a relatively large size of QC sample. This could
create an incentive for a state with a relatively high error rate to use smaller QC 
samples just to reduce the potential for disallowances, even though it would be 
undesirable from the point of view of corrective action and other uses of the quality 
control sample, as well as from the Federal goal of achieving an acceptable return 
from disallowances. Consequently, it would be necessary, if a lower confidence 
bound approach were adopted, to specify minimum sample sizes, and these minima 
should not be so small as to unreasonably lower the expected lower confidence 
bounds. Of course, relatively larger samples will also better serve the basic role for 
which QC was created, i.e., providing guidance for improved AFDC design, and for 
taking corrective action to improve administration. This issue of desired 
(optimum) size of QC sample for computing disallowance is briefly considered in 
Section 3.4 and in Appendix G. 






Table D-2. Distribution of states by the estimated correlation between state and Federal findings for
fiscal years 1981-1984, for 44 states

 Estimated                        Fiscal years
 correlation         1981     1982     1983     1984     All years

 .40 - .49             -        4        -        -          4
 .50 - .59             7        3        1        -         11
 .60 - .69             3        2        3        2         10
 .70 - .74             2        3        4        -          9
 .75 - .79             5        6        5        4         20
 .80 - .84             6        7        5        5         23
 .85 - .89             3        7        5        9         24
 .90 - .94             7        5       11       12         35
 .95 - .99             9        5       10       10         34
 1.00                  2        2        -        2          6

 Totals               44       44       44       44        176

 Median              .846     .837     .881     .905       .875
 30th percentile     .760     .780     .782     .870       .791

Note: The correlations are tallied only for the states that did not use stratified samples.






Figure D-1. Cumulative distribution of estimated correlation for 44 states

[Figure: cumulative frequency (vertical axis) plotted against the estimated correlation of
state and Federal findings (horizontal axis), with one curve for each of fiscal years 1981,
1982, 1983, and 1984.]





Figure D-2A. Cumulative distribution of the estimated correlation, Population A

[Figure: cumulative distribution plotted against the estimated correlation of state and
Federal findings (horizontal axis, 0.51 to 0.99), for Population A.]






Figure D-2B. Cumulative distribution of the estimated correlation, Population B

[Figure: cumulative distribution plotted against the estimated correlation of state and
Federal findings (horizontal axis, 0.51 to 0.99), for Population B; one curve is labeled
n' = 160.]






Figure D-2C. Cumulative distribution of the estimated correlation, Population C

[Figure: cumulative distribution plotted against the estimated correlation of state and
Federal findings (horizontal axis, 0.51 to 0.99), for Population C; one curve is labeled
n' = 360.]






Figure D-3A. Cumulative distribution of the nominal 95 percent lower confidence bounds of the payment error rate using (A) the
estimated correlation, and (B) the minimum correlation rule, in the regression estimate of variance, for Population A
(based on independent simulations of 1000 samples)

[Figure: cumulative frequency (vertical axis, 100 to 1000) plotted against the lower bound
(horizontal axis, 0.02 to 0.11). Separate curves, based on (A) the estimated correlation and
(B) the minimum correlation, are shown for sample sizes 2400-360, 1200-360, 880-260, and
350-160. The symbols for the estimated correlation do not show when closely overlapped by
those for the minimum correlation.]






Figure D-3B. Cumulative distribution of the nominal 95 percent lower confidence bounds of the payment error rate using (A) the
estimated correlation, and (B) the minimum correlation rule, in the regression estimate of variance, for Population B
(based on independent simulations of 1000 samples)

[Figure: cumulative frequency (vertical axis) plotted against the lower bound (horizontal
axis, 0.03 to 0.1). Separate curves, based on (A) the estimated correlation and (B) the
minimum correlation, are shown for each sample size, including 880-260 and 350-160. The
symbols for the estimated correlation do not show when closely overlapped by those for
the minimum correlation.]






Figure D-3C. Cumulative distribution of the nominal 95 percent lower confidence bounds of the payment error rate using (A) the
estimated correlation, and (B) the minimum correlation rule, in the regression estimate of variance, for Population C
(based on independent simulations of 1000 samples)

[Figure: cumulative frequency (vertical axis, up to 1000) plotted against the lower bound
(horizontal axis, 0.02 to 0.09). Separate curves, based on (A) the estimated correlation and
(B) the minimum correlation, are shown for sample sizes 2400-360, 1200-360, 880-260, and
350-160. The symbols for the estimated correlation do not show when closely overlapped by
those for the minimum correlation.]






APPENDIX E 

EXPLORATION OF SOME ALTERNATIVE PROCEDURES 
FOR COMPUTING POOLED VARIANCE ESTIMATES 



E.1 Introduction

The current practice in the AFDC quality control program is to estimate 
the variance of the overpayment error rate for each state using only the data 
provided by the sample for that state in the current period. It seems likely that the 
mean square error of the estimated variance could be reduced by somehow making 
use of additional data. The additional data might be: 

(a) Data for the same state for prior periods; or 

(b) Data for other (presumably similar) states. 

We refer to variance estimation procedures that utilize data from prior time periods 
or from other states as pooled variance estimation procedures. 

Three principal uses for an estimated variance of an estimated 
overpayment error rate are: 

(1) To provide a general measure of precision of the estimated 
overpayment error rate. Examples of this are to indicate the 
approximate magnitude of the sampling variability of an 
estimated overpayment error rate, or to compare the precision of 
estimates for different states, or to compare the precision of 
different allocations of the sampling effort to the state sample
and to the Federal subsample for a state. 

(2) To provide a lower confidence bound for an overpayment error
rate. Consideration has been given to the use of a lower 
confidence bound in various ways in the computation of 
disallowances, as discussed, for example, in Chapters of this 
report. 






(3) To predict for a future year, the sampling errors that would 
result from specific sizes of Federal and state samples for a state, 
or alternatively, to determine for a future year, the approximate
sample sizes needed to achieve a specified level of precision. 

The pooled variance estimation procedures that we discuss in this 
appendix will be especially useful for purposes (1) and (3). We have already shown 
in Sections 2.3 and 2.4 that for purpose (2) the direct estimate of variance based only 
on data for the current year for a state, presumably (but not necessarily) using a 
transformed Jackknife variance estimator, is a preferred procedure for computing 
lower confidence bounds. As discussed in Section 2.5, such a procedure provides a
more stable lower confidence bound than would the use of the unknown true
variance of the overpayment error rate, even if it were known, or than would result 
from the use of a pooled variance estimate. 

In this appendix, we provide descriptions and approximate evaluations 
of some alternative procedures for pooled unit variance estimation. 



E.2 Variance Estimates Using Data for the Same State for Prior Periods

Alternative (a) mentioned in the introduction to this appendix 
suggests the possibility of using the regression of the unit variance (defined as the 
estimated variance of the estimated overpayment error rate multiplied by the 
Federal subsample size) on other current and recent past data for the same state. We 
tested this procedure by using the data for the 50 states and the District of Columbia 
for the four six-month periods in fiscal years 1981 and 1982. The regression was 
estimated from the data for the first three of the four periods. The regressor 
(independent) variables were: 

• The estimated overpayment error rate for period 3; 

• The estimated overpayment error rate for period 2;

• The estimated overpayment error rate for period 1; 

• The estimated unit standard error for period 2; and 






• The estimated unit standard error for period 1.

The regressand (dependent) variable was the estimated unit variance for period 3. 
No weights were used in computing the regression. The estimated multiple 
correlation was .87, indicating that about three-fourths of the variance of the 
estimated unit variances in period 3 among the states was explained by the 
regression. It may be seen from the Technical Note at the end of this Appendix that 
the independent variables for period 1 made trivial contributions to the regression 
estimates. 

Of course, the predictive value of a regression equation appears to be 
higher for the data used in computing the regression coefficients than will be the 
case when tested with an independent sample from the same population. An 
independent sample for the same period is not available. However, a useful test of 
the effectiveness of the regression procedure is to apply it to data for a succeeding 
period. Consequently, an estimate of the variance for each state was computed for 
period 4 by applying the regression coefficients that had been computed for period 3. 
The regressor variables were now the estimated overpayment rates for periods 4, 3, 
and 2, and the estimated unit variances for periods 3 and 2. For period 4, the 
estimated multiple correlation was .68, indicating that about one-half of the 
variance among the states was explained by the regression. Figure E-1 illustrates, 
with scatter charts, the relationship of the direct and regression estimates of the unit 
variances, for states, for both periods 3 and 4. Table E-2 in the Technical Note for 
Appendix E shows, by states, the values of the dependent and independent variables 
used in the regression, as well as the unit variances estimated from the regression 
for periods 3 and 4. 
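
The statements that a multiple correlation of .87 explains about three-fourths of the variance, and one of .68 about one-half, follow from squaring the multiple correlation. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def share_explained(multiple_r):
    """Fraction of variance explained is the square of the multiple correlation."""
    return multiple_r ** 2


print(round(share_explained(0.87), 2))  # 0.76
print(round(share_explained(0.68), 2))  # 0.46
```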

We note that if a predicted value were a perfect prediction of the true 
unit variance for a state, the correlation between the predicted and the direct 
variance estimate could not be high if the direct estimates are subject to large 
variances, as indeed they are. Nevertheless, if a prediction method based on 
independent data yields a higher correlation with the direct estimates than does a 
different prediction method, also based on independent data, the higher correlation 
is evidence of the greater precision of that method. We also note that since this 






particular regression approach involved the use of the estimated error rate for the 
current period as an independent variable, the result is a higher correlation and a
higher fraction of variance explained than would be the case if the current error rate 
were not used as an independent variable. Moreover, since the independent 
variables used in the regression predictions are subject to large variances, we 
believe, without further evaluation, that this regression approach for utilizing prior 
years' data provides a less promising prediction method than the alternatives we 
discuss below, which employ pooled variance estimates across a considerable 
number of states. 



E.3 Pooled Variance Estimates for Groups of States

Alternative (b) mentioned in the introduction suggests the possibility
of using a composite estimator of the variance, that is, a weighted mean of the direct
estimate for the state and the average of the estimates for some group of states that
are similar to the given state in the sense that their average unit variance for recent
prior periods was approximately the same. The weights would be chosen so as to
minimize, so far as feasible, within each group of states, the mean square error of
each estimated state unit variance. To experiment with this idea, the groups were
determined by sequencing the states according to the average value of the estimated
unit variance in fiscal years 1981 and 1982. Composite variance estimates for fiscal
year 1984 were to be made using these groups. We note that we use data for fiscal
years 1981 and 1982 to group states for making variance estimates for fiscal year 1984.
In practice, the prior years' data might or might not be available for such a grouping.
Later, we test the method by examining how well the pooled variance estimates for
fiscal year 1984 serve as predictors of the variances for 1983. It would have been
desirable to use 1985 data (which were not available). Consequently, 1983 serves as a
proxy for 1985.

Figure E-2 shows the average unit variance for the states, arranged
according to the value of the average unit variance in 1981 and 1982, as well as the
groups that were defined.






On the basis of this graph: 

The first group was defined to consist of the first 10 states; 
The second group consists of the 11th through the 21st state; 
The third group consists of the 22nd through the 31st state; 
The fourth group consists of the 32nd through the 41st state; and 
The fifth group consists of the 42nd through the 51st state. 
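
The grouping by rank can be sketched as follows. This is only an illustration: the report chose the group boundaries by inspecting Figure E-2, whereas the sketch simply applies the stated rank cutoffs, and the state labels and unit variances are made up.

```python
def assign_groups(avg_unit_variance, last_ranks=(10, 21, 31, 41, 51)):
    """Rank states by average unit variance and cut the ranking into groups.

    avg_unit_variance maps a state label to its average unit variance;
    last_ranks holds the rank of the last state in each group.
    """
    ordered = sorted(avg_unit_variance, key=avg_unit_variance.get)
    groups, start = [], 0
    for last in last_ranks:
        groups.append(ordered[start:last])
        start = last
    return groups


# Hypothetical inputs: 51 made-up labels with made-up, distinct unit variances.
fake = {f"state_{i:02d}": float((i * 37) % 51) for i in range(51)}
groups = assign_groups(fake)
print([len(g) for g in groups])  # [10, 11, 10, 10, 10]
```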

The states assigned to each of the five groups can be seen by referring to 
Table E-3, where the states are ordered by group, with a space between groups. For
each state, the composite estimate was the weighted mean of the direct estimate of 
the unit variance for the state and the weighted average of these estimates for the 
other states in its group, under the condition that the other states had a Federal 
subsampling rate the same as that of the specified state. 

Each group of states was then used to make a pooled unit variance
estimate for the current period for each of the included states. The pooled variance
estimate for state k within a group is made by taking a weighted average of the
current unit variance estimate for the particular state (state k) and the pooled unit
estimate for the other states in the group. More specifically, the pooled unit
variance estimate for state k is obtained by computing the weighted average

$$\tilde{s}_k^2 = w_k s_k^2 + (1 - w_k)\, s_{ok}^2,$$

where $s_k^2$ is the estimated unit variance of $\hat{R}_k$ (computed as in the present AFDC
procedure) for the current period for state k, and $s_{ok}^2$ is the weighted average
(weighted by the Federal subsample size) of the unit variance estimates for the
current period for the other states in the group (excluding state k). In this
computation for state k, the unit variance estimate for each of the other states is
modified by replacing its Federal subsampling rate by the rate used for state k.
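
The weighted average just described can be sketched in a few lines. The sketch is a simplification under stated assumptions: it omits the adjustment of the other states' unit variances to the Federal subsampling rate of state k, the weight is supplied by the caller rather than optimized, and all names and numbers are hypothetical.

```python
def pooled_other_states(unit_vars, n_fed, k):
    """Average of the other states' unit variances, weighted by Federal subsample size."""
    others = [j for j in unit_vars if j != k]
    total_n = sum(n_fed[j] for j in others)
    return sum(n_fed[j] * unit_vars[j] for j in others) / total_n


def composite_unit_variance(unit_vars, n_fed, k, w_k):
    """Weighted mean of state k's direct estimate and the pooled estimate for the others."""
    s_ok = pooled_other_states(unit_vars, n_fed, k)
    return w_k * unit_vars[k] + (1.0 - w_k) * s_ok


# Hypothetical three-state group; values are made up for illustration.
unit_vars = {"A": 4.0, "B": 2.0, "C": 3.0}
n_fed = {"A": 150, "B": 300, "C": 100}
print(composite_unit_variance(unit_vars, n_fed, "A", w_k=0.25))
```

A small weight on the direct estimate pulls the composite strongly toward the group average, which is the behavior expected when the direct estimate is much less stable than the pooled one.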






This pooled estimate will considerably improve the unit variance
estimate for state k provided that the true and unknown unit variance in each of
the other states in the group is not too different from $S_k^2$, the true (unknown) unit
variance for state k. The improvement results because $s_{ok}^2$ is estimated from a much
larger sample of cases than is $s_k^2$. Of course, $s_{ok}^2$ is, in fact, a biased estimate of
$S_k^2$, the bias depending on how much the expected value of $s_{ok}^2$ differs from $S_k^2$.
The weight $w_k$ for state k can be chosen, as described in the Technical Note to
Appendix E, so as approximately to minimize the mean square error of $\tilde{s}_k^2$ as an
estimate of $S_k^2$, taking account of approximate measures of the bias as well as the
variances involved.
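
The trade-off between variance and bias in choosing the weight can be illustrated with a short sketch. The formula used here is the textbook minimizer for a weighted mean of an unbiased estimate and an independent, possibly biased one; it is an assumption made for illustration and is not taken from the Technical Note, whose exact expressions may differ. All names and numbers are hypothetical.

```python
def mse_of_composite(w, var_direct, var_pooled, bias_sq):
    """MSE of w*direct + (1-w)*pooled, assuming the two estimates are independent."""
    return w ** 2 * var_direct + (1.0 - w) ** 2 * (var_pooled + bias_sq)


def optimum_weight(var_direct, var_pooled, bias_sq):
    """Weight on the direct estimate that minimizes mse_of_composite."""
    return (var_pooled + bias_sq) / (var_direct + var_pooled + bias_sq)


# Hypothetical values: the direct estimate is far noisier than the pooled one.
w_zero_bias = optimum_weight(4.0, 1.0, 0.0)
w_high_bias = optimum_weight(4.0, 1.0, 3.0)
print(w_zero_bias, w_high_bias)
```

Under this assumed form, a larger squared bias shifts weight toward the direct estimate for the state, which matches the qualitative behavior described in the text.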

We note, especially, as seen in the Technical Note, that in order to
compute approximately optimum values of $w_k$ for a state, estimates are needed of
the unit variance for each state, as well as of the bias of $s_{ok}^2$ as an estimate of
$S_k^2$. Of course, we do not know the values of these terms and must estimate them. We
have used approximate procedures to do this, as discussed in the Technical Note. In
particular, the bias could be estimated directly for each state, but such estimates are
subject to variances that are too large to be useful. Consequently, we examine the
implications of some alternative procedures for determining an approximately
optimum $w_k$.

As seen in the Technical Note, the estimates of the average squared 
bias were negative for four of the five groups, and positive for one. While the true 
squared bias must be zero or positive, negative estimates are possible. These 
estimates, even the average for a group of about 10 states, are still subject to very 
large sampling errors. Of course, the negative estimates are the result of sampling 
error, and we regard the positive ones as also substantially determined by sampling 
variability. Consequently, we have used two different measures of bias that result in
two sets of approximately "optimum" weights. For one set, we used an estimate of 
zero bias for each state. As another alternative, we use for each state a high average 






squared bias estimate obtained as the average of the absolute values of the estimated 
squared biases of the five groups. 

The manner in which the weights in the composite estimator were 
determined, so as approximately to minimize the average mean square error for the 
states in the group, is detailed in the Technical Note at the end of this appendix.

Tables E-3 and E-4 display, for the alternative estimates of optimum 
weights, the composite estimate of the unit variance in fiscal year 1984 for each of 
the 50 states and the District of Columbia. The tables also show for each state, 
among other things, the size of the Federal subsample (n'), the weight used in the 
composite estimator, the direct estimate of the unit variance, the variances of the
estimated average variance in the group and of the direct variance estimate, and the 
variance of the composite estimate of the unit variance. The definitions and the 
estimation procedures are given in the Technical Note. 

In addition, as a fourth and simpler alternative pooled variance
estimation procedure, we have made pooled estimates of the unit variance of the
Federal overpayment errors, $s^2$, of the average payment error, $\bar{t}$, and of the
estimated correlation of the Federal and the state determinations of overpayment
errors, r. These estimates were pooled over all states in the group. The simple
pooled unit variance estimate for a state is then

$$\frac{s^2}{\bar{t}^2}\left\{1 - r^2 (1 - f_j)\right\},$$

where $f_j = n'_j/n_j$ is the subsampling fraction for the Federal subsample in the state.
This procedure provides what we refer to as a simple pooled variance estimate, and
is similar but not equivalent to the assumption of zero bias in the computation of
optimal weights. Table E-5 displays the simple pooled estimates of unit variances.
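
As a rough illustration of the simple pooled estimate, the sketch below assumes the expression has the form (s^2 / t-bar^2) {1 - r^2 (1 - f)}, with s^2, t-bar, and r pooled over the group and f the Federal subsampling fraction of the individual state. The function name and the numerical inputs are hypothetical and are not taken from Table E-5.

```python
def simple_pooled_unit_variance(s_sq, t_bar, r, f):
    """Pooled s^2 / t_bar^2, scaled by 1 - r^2 (1 - f) for subsampling fraction f."""
    return (s_sq / t_bar ** 2) * (1.0 - r ** 2 * (1.0 - f))


# Hypothetical pooled group values, applied to two subsampling fractions.
s_sq, t_bar, r = 900.0, 300.0, 0.8
for n_fed, n_state in [(360, 2400), (160, 350)]:
    f = n_fed / n_state
    print(n_fed, n_state, round(simple_pooled_unit_variance(s_sq, t_bar, r, f), 4))
```

Because the pooled quantities are shared within a group, states in the same group differ in this estimate only through their subsampling fractions.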






In an effort to evaluate the two alternative composite variance 
estimators, we have made approximate estimates of their variances. We refer to 
these estimates of the variance of the estimated variances as "experimental" 
estimates. This term has been used because we have not made these estimates 

directly from the sample data. Instead, as discussed in the Technical Note, we have
derived them from the assumption that the relvariance of the direct estimate,
$s_{\hat{R}}^2$, of the variance of $\hat{R}$, the regression estimator for a state from a
double sample, can be approximated by

J2 =-i±(4)2 . (1) 



The value of $\sigma_{\hat{R}}^2$ is estimated directly from the sample data by
$s_{\hat{R}}^2$. Approximate values for $\beta$ are derived from the estimates of the
variance of variances that have been obtained from the 1000 replicated samples from
each of the three test populations, for various sizes of state samples, n, and of
Federal subsamples, n'.

We did not make direct analytic estimates of the variance of the 
variance of the regression estimator for a double sample because the theory is not 
available. We did not regard it as worth the effort to develop the theory at this time 
because we believe our "experimental" estimates provide an acceptable alternative, 
and perhaps a better alternative than direct estimates which would be subject to very 
large variances. 

The estimated values of $\beta$ are shown in Table C-1 and are also
discussed in the Technical Note in Appendix C. A linear regression on the Federal
subsampling rate was fitted to these values of $\beta$ and used to compute approximate
values of $\beta$ for each state. These are displayed in Tables E-3 and E-4. These and the
estimated unit variances were then substituted in Equation (1) above to compute the 
"experimental" values of the variance of the estimated unit variance for each state. 
The variances of the composite estimate of unit variances were derived from these, 
as explained in the appended Technical Note. 






We now present two kinds of evaluations of the pooled variance 
estimators. From Figure E-3 (each point represents the ratio for a state), it is seen 
that the ratio of the estimated variance of the direct estimate to the estimated 
variance of the composite estimate with zero as the estimate of bias squared varies 
from an average of approximately a factor of 14 (varying from about 12 to 17) for 
states with annual Federal subsamples of 150 to an average of approximately 8 
(varying from about 6 to 10) for states with a Federal subsample size of 
approximately 360. Thus, the variance of the composite estimate using zero as the 
squared bias is small, very substantially below that of the direct estimate of the 
variance. 

The simple pooled variance estimator yields results that are very close 
to those for the composite estimator using zero as the squared bias, so the variance 
reductions for the simple pooled variance estimator are similar to those shown in 
Figure E-3 for the "zero bias" estimator. In fact, it is shown in the Technical Note 
that the correlation, across states, of the simple pooled variance estimates with those 
from the composite estimate using zero bias squared is approximately .98. This 
correlation is high enough that we regard it as not worthwhile to make a separate 
evaluation of the variances of the simple pooled variance estimator. 

We note that while the reductions in the variance of the variance 
estimates are substantial for all Federal subsample sizes, they are greatest for the 
states in which the Federal subsample is relatively small, and in which reductions 
in the variance of the variance estimates are most needed. We also note that these 
results are based on the approximate experimental variance of variance estimates, as 
discussed earlier. However, since these results depend importantly on the sample 
sizes involved, the ratios displayed in Figure E-3 should be reasonably close to what 
they would be if the true variances of the variance estimates were known. 

Figure E-3 also displays the ratios of the variance of the direct variance 
estimate to the variance of the composite variance estimate using the high estimate 
of the squared bias. The resemblance of the simple pooled estimator to the 
composite estimator using zero squared bias is a consequence of the similarity in the 
weights assigned to the direct estimate of the unit variance in these two estimators. 




In Figure E-4, we show the weight assigned to each state for each of four 
estimators of the variance of the estimated unit variance. (The estimator designated 
"adjusted simple pooling" is described in Section 2.5.1 of this report.) In this figure, 
the states are arranged in order of the weights assigned in the simple pooling. We 
note that the weights are nearly identical for the simple pooled estimator and the 
composite estimator using zero squared bias. The weights for the composite 
estimator using the "high" squared bias are much greater, and therefore, result in 
less variance reduction. Consequently, from the point of view of variance 
reduction, there is a considerable advantage in using the zero bias squared in the 
composite estimator versus the alternative high bias squared estimator that we have 
evaluated. The adjusted pooled estimator assigns weights that are slightly less than 
twice those assigned by the simple pooled estimator. 

The next point to evaluate is how well the direct estimate of the unit 
variance, and each of the pooled variance estimates, serves as an estimate of the 
unknown true unit variance for each state. We cannot make this evaluation 
directly but can do it indirectly. We have shown in the Technical Note that, 
without knowing the true variances for 1983, we can approximate the correlation, 
across states, between the true state unit variances for 1983 and the variance 
estimates for 1984, for each variance estimation procedure. 

Table E-1 summarizes the indicated estimated coefficients of 
correlation (r), and their squares ($r^2$), called coefficients of determination, obtained 
as described in the Technical Note. These are estimated unweighted correlations 
across states; a small state and a large one have equal weights. 






Table E-1. Estimated unweighted correlations of true unit variance of R for 1983 with estimated unit 
variances for 1984 

Estimated unweighted correlation of true unit         Estimated           Estimated 
variance of R for 1983, with:                          coefficient of      coefficient of 
                                                       correlation, r      determination, r² 

(a) Direct variance estimate for 1984                       .64                 .41 

(b) Composite variance estimate for 1984 using 
    zero squared bias                                       .69                 .47 

(c) Composite variance estimate for 1984 using 
    high squared bias                                       .75                 .57 

(d) Simple composite variance estimate for 1984             .69                 .47 


These correlations are reasonably high, although not as high as would 
be desirable. About half of the unweighted variance between states of the true unit 
variance is accounted for by each of the three pooled variance estimators, indicated 
by the squared correlations. The correlations for the pooled variance estimators are 
somewhat higher than the correlation for the direct variance estimation (although 
this may result from sampling variability). This fact, together with the fact that their 
variances are very much smaller, is sufficient to indicate the substantial advantages 
of using a pooled variance estimator for general precision measures, for predicting 
needed sample sizes, or for predicting the precision to be obtained from specified 
sample sizes in a future year. 

We note that it would be desirable, also, to estimate the correlations of 
the 1984 true state unit variances with the various 1984 variance estimators. We are 
not able to do this because we do not have independent direct variance estimates for 
1984. Nevertheless, it is obvious that the correlations of 1984 true unit variances 
with the 1984 variance estimates would be higher than those shown in Table E-1. 






On the evidence presented, it appears that the simple pooled variance 
estimator might reasonably be regarded as the preferred one among the three 
estimators we have evaluated. Since this estimator is almost identical to the 
composite estimator using zero squared bias, the gains in variance reduction will be 
substantial, as indicated by Figure E-3. Its estimated correlation with the 1983 true 
values is lower than that of the composite variance estimator with the high squared 
bias. The gain in correlation with the latter (which may be real or the result of 
sampling error) seems not to be worth the substantial additional computation 
complexity involved in computing the composite variance estimates. The simple 
pooled variance also has the advantage of providing separate estimates of the 
variance components in the regression estimator (i.e., $s^2/\bar{t}^2$ and $r$) for use in 
evaluating alternate allocations to the state sample and the Federal subsample. 

It is possible that, on further analysis, an estimator intermediate 
between the simple pooled variance estimator and the composite estimator with 
high bias squared would be found to have additional advantages. We have 
described such an alternative in Section 2.5, and the weights assigned by such an 
estimator are shown in Figure E-4. It seems likely that it would have minor 
advantages over the simple pooled variance estimator as defined and evaluated 
here. When data for an additional year become available, such a modified simple 
pooled variance estimator may reasonably be evaluated in comparison with those 
shown here. 

We conclude, then, that for the present, the simple pooled variance 
estimator (or the modifications of it, as described in Section 2.5.1 of the report) is to 
be preferred for most variance estimation purposes other than for the computation 
of lower confidence bounds. The advantages, for these purposes, over the direct 
variance estimator are substantial. 






Figure E-1. Scatter charts illustrating the relationship between the direct estimate of variance and 
the estimate based on the regression, for states, for periods 3 and 4 

[Two scatter charts, one for period 3 and one for period 4. Vertical axis: regression estimate of 
unit variance for the period. Horizontal axis: direct estimate of unit variance for the same period.] 






Figure E-2. Average unit variance in FY 1981-82, for states arranged by that average 

[Chart. Vertical axis: unit variance. Horizontal axis: state, numbered 1 to 51 in order of average 
unit variance, with the groups of states marked.] 






Figure E-3. Ratio of the variance of the direct estimate of unit variance to the composite estimate of 
unit variance using zero and high squared bias, related to the size of the Federal 
subsample 

[Scatter chart, one point per state for each estimator. Vertical axis: ratio of variances. 
Horizontal axis: Federal subsample size. Legend: solid markers, zero bias; open markers, high bias.] 






Figure E-4. State weights for pooled unit variance estimates, for states sequenced by weight for the 
simple pooled estimate 

[Chart. Vertical axis: weight. Horizontal axis: state sequence number, by weight for the simple 
pooled estimate. Legend: zero squared bias; high squared bias; simple pooling; adjusted simple 
pooling.] 






TECHNICAL NOTE FOR APPENDIX E 



This note gives details on the computations referred to in Appendix E. 

Regression Estimator of the Unit Standard Error of the Payment Error Rate 

We are concerned here with the regression of the unit variance 
(defined as the variance of the estimate of the payment error rate, multiplied by the 
Federal sample size) on the estimated error rates in the same and the previous two 
six-month periods and on the unit variances in the previous two periods. In matrix 
notation, we wish to fit the model 

    $y = X\beta + e$ 

where X is a matrix of 51 rows (the 51 states) and six columns (corresponding to the 
constant term and the five regressor variables as defined in Appendix E), and y is the 
column vector of the unit variances. We have estimated the regression coefficient 
vector β by (unweighted) least squares, namely by 

    $b = (X^{T}X)^{-1}X^{T}y$ . 

The computations were made using the data for the first three of the 
four periods available, yielding the following solution for the vector b: 

    -0.0005    Constant term 
     0.2873    Payment error rate, period 3 
    -0.0090    Payment error rate, period 2 
    -0.0033    Payment error rate, period 1 
     0.2941    Unit standard error, period 2 
    -0.0000    Unit standard error, period 1 

The regression estimates of the unit variance in period 3 varied among the 51 states 
from to 0.069, with a mean value of 0.020 and a standard deviation of 0.013. 
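
The least squares step above can be sketched in a few lines. This is a minimal illustration only, assuming a small made-up data set with a constant term and two regressors rather than the report's 51 states and five regressors; the function names `solve_linear_system` and `least_squares` are hypothetical and the code solves the normal equations directly, which may not be how the original computations were carried out.

```python
def solve_linear_system(a, rhs):
    """Solve a x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix, copied so the inputs are left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(x_rows, y):
    """Return b = (X'X)^{-1} X'y for a design matrix given as a list of rows."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up illustration: constant term plus two regressors, exact linear response.
regressors = [(0.05, 0.01), (0.08, 0.03), (0.12, 0.02), (0.07, 0.05), (0.10, 0.04)]
design = [[1.0, r1, r2] for r1, r2 in regressors]
response = [0.001 + 0.3 * r1 + 0.2 * r2 for r1, r2 in regressors]
coefficients = least_squares(design, response)
print([round(c, 6) for c in coefficients])
```

Because the made-up response is an exact linear function of the regressors, the fitted vector should recover the generating coefficients up to rounding error, which is a convenient check on the algebra.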






Table E-2 gives the data and the results of the regression value of the 
unit variance for period 3 as well as the calculated unit variance for period 4 using 
the coefficients given above. The regression estimate of the unit variance in 
period 4 varied among the states from to 0.058, with a mean value of 0.019 and a 
standard deviation of 0.012. 



Composite Estimator of the Unit Variance 

We consider first the general problem in which a composite estimate $\tilde{x}_i$ 
for the i-th locality of a group of localities is a weighted mean of a local unbiased 
estimate $x_i$ and the mean of the estimates $x_j$ of the other localities that are members 
of the same group. Let m denote the number of localities in the group. The 
composite estimator for the i-th locality is defined by 

    $\tilde{x}_i = W_i x_i + (1 - W_i)\,\bar{x}_{(i)}$    (1) 

where $\bar{x}_{(i)}$ denotes the mean of the estimates for the m-1 localities other than the 
i-th locality. We wish to determine the weight $W_i$ that minimizes the mean square 
error of the composite estimator. We have 

    $\mathrm{MSE}(\tilde{x}_i) = \mathrm{Var}(\tilde{x}_i) + (E\tilde{x}_i - Ex_i)^2$    (2) 

    $\qquad = W_i^2 \sigma_{x_i}^2 + (1 - W_i)^2 \sigma_{\bar{x}_{(i)}}^2 + (1 - W_i)^2 (E\bar{x}_{(i)} - Ex_i)^2 .$ 

The value of $W_i$ that minimizes the mean square error is obtained by equating to 
zero the derivative of the mean square error with respect to $W_i$: 

    $0 = 2 W_i \sigma_{x_i}^2 - 2(1 - W_i)\,[\sigma_{\bar{x}_{(i)}}^2 + (E\bar{x}_{(i)} - Ex_i)^2] .$ 

Solving this equation for $W_i$ yields the optimum value 

    $W_i = \dfrac{\sigma_{\bar{x}_{(i)}}^2 + (E\bar{x}_{(i)} - Ex_i)^2}{\sigma_{x_i}^2 + \sigma_{\bar{x}_{(i)}}^2 + (E\bar{x}_{(i)} - Ex_i)^2} .$    (3) 
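
As a numerical check on the optimum weight, the sketch below evaluates the mean square error of the composite estimator and compares the closed-form weight with a coarse grid search. The input values and the function names `composite_mse` and `optimum_weight` are hypothetical and chosen only for illustration.

```python
def composite_mse(weight, var_local, var_group_mean, squared_bias):
    """Mean square error of W*x_i + (1 - W)*xbar_(i), as in Equation (2)."""
    return (weight ** 2 * var_local
            + (1 - weight) ** 2 * (var_group_mean + squared_bias))


def optimum_weight(var_local, var_group_mean, squared_bias):
    """Weight on the local estimate that minimizes the MSE, as in Equation (3)."""
    pooled_term = var_group_mean + squared_bias
    return pooled_term / (var_local + pooled_term)


# Hypothetical inputs, not taken from the report.
var_local, var_group_mean, squared_bias = 4.0e-4, 0.4e-4, 0.6e-4
w_opt = optimum_weight(var_local, var_group_mean, squared_bias)
mse_opt = composite_mse(w_opt, var_local, var_group_mean, squared_bias)

# A coarse grid search over [0, 1] should not find a smaller MSE.
grid_best = min(composite_mse(k / 1000, var_local, var_group_mean, squared_bias)
                for k in range(1001))
print(w_opt, mse_opt, grid_best)
```

With these inputs the weight on the local estimate is about 0.2. Setting the squared bias to zero lowers it to about 0.09, which illustrates why a larger assumed bias shifts weight toward the direct estimate.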



The parameters in the equation for the optimum $W_i$ are not known, so 
that estimates of them are used to obtain an estimate of the optimum weight. 

In our case, $x_i$ is the estimated unit variance $s_i^2$ of the estimated 
payment error rate $\hat{R}_i$ for state i. We make the assumption 

    $\sigma_{x_i}^2 = (\beta_i - 1)\,\sigma_i^4 / n'_i$    (4) 

where $\beta_i$ is a specified constant for each state i and $\sigma_i^2$ is the unit variance that is 
estimated by $s_i^2$. This relationship would hold for simple random sampling with 
replacement.* For the regression estimator with double sampling, as used in AFDC, 
it is an approximate relationship. The specified $\beta_i$ for each state are shown in 
Tables E-3 and E-4. The values of $\beta_i$ were computed from the observed relationship 
of $\beta_i$ (as given by the approximation $\beta_i = 1 + n'_i s_{x_i}^2 / s_i^4$, that is yielded by 
Equation (4)) to the Federal subsampling rates n'/n in the Test Populations A, B, and 
C. A linear regression equation was fitted to the data shown in Table C-1 in 
Appendix C. The dependent variable was the $\beta_i$ shown in the table, and the 
independent variable was $f_i = n'_i/n_i$. The resulting regression equation was 

    $\beta_i = 64.3 - 54.47 f_i$ . 



*Hansen, M.H., Hurwitz, W.N., and Madow, W.G., Sample Survey Methods and Theory, Vol. I, p. 427 
(New York: John Wiley & Sons, 1953) 
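
The fitted line for β and the variance relationship of Equation (4) combine as in the sketch below. The state sample of 1,000 cases, the Federal subsample of 200 and the unit variance of 0.02 are hypothetical values, and the function names are illustrative only.

```python
def beta_from_subsampling_rate(rate):
    """Fitted line from the text: beta = 64.3 - 54.47 * f, with f = n'/n."""
    return 64.3 - 54.47 * rate


def variance_of_unit_variance(unit_variance, beta, federal_sample_size):
    """Equation (4) with estimates substituted: (beta - 1) * s^4 / n'."""
    return (beta - 1.0) * unit_variance ** 2 / federal_sample_size


# Hypothetical state: 1,000 state-sample cases, 200 in the Federal subsample.
rate = 200 / 1000
beta = beta_from_subsampling_rate(rate)
var_s2 = variance_of_unit_variance(0.02, beta, 200)
print(round(beta, 3), var_s2)
```

Under the fitted line a higher subsampling rate gives a smaller β, so states that review a larger share of their sample are assigned a smaller variance of the estimated unit variance for a given Federal subsample size.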






We then define the estimator 

    $s_{x_i}^2 = (\beta_i - 1)\,\{u_i[1 - (1 - n'_i/n_i)\,r_i^2]\}^2 / n'_i$    (5) 

where $u_i$ denotes the ratio of the estimated variance of the Federal determination of 
the overpayment errors to the square of the average payment error estimated from 
the state sample, and $r_i$ denotes the estimated correlation between the Federal and 
state determinations of the overpayment errors. The expression in the braces 
divided by $n'_i$ is the appropriate regression variance estimator of the payment error 
rate, $\hat{R}_i$, as used by AFDC. 

Groups of states were defined in the following way. For each state i, for 
each six-month period t in fiscal year 1981-82, the unit variance was computed as 

    $s_{ti}^2 = u_{ti}[1 - (1 - 0.2)\,r_{ti}^2]$ 

where the $u_{ti}$ and $r_{ti}$ are defined as $u_i$ and $r_i$ in Equation (5). This computation of 
the unit variance replaces the Federal subsampling rate that was used for the state by 
the constant rate .2, which is roughly the average Federal subsampling rate. The 
average unit variance for state i in fiscal year 1981-82 was then taken as the weighted 
mean of the four six-month periods, viz., 

    $s_i^2 = \sum_{t=1}^{4} n'_{ti}\, s_{ti}^2 \Big/ \sum_{t=1}^{4} n'_{ti}$ 

where $n'_{ti}$ denotes the Federal sample size in period t. The states were ordered by 
the value of $s_i^2$ and five groups were defined as exhibited in Figure E-2. 



For the set of states in a group other than the state i, the average 
variance is 

    $\bar{x}_{(i)} = \sum_{j \ne i} n'_j\, u_j [1 - (1 - n'_j/n_j)\,r_j^2] \,/\, (n' - n'_i)$    (6) 

whose variance is estimated by 

    $s_{\bar{x}_{(i)}}^2 = \sum_{j \ne i} n'^{\,2}_j\, s_{x_j}^2 \,/\, (n' - n'_i)^2$ 

    $\qquad = \sum_{j \ne i} n'_j (\beta_j - 1)\,\{u_j[1 - (1 - n'_j/n_j)\,r_j^2]\}^2 \,/\, (n' - n'_i)^2 .$    (7) 
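
The leave-one-out quantities of Equations (6) and (7) can be sketched as follows. All inputs are made-up values for a hypothetical group of three states, the function names are illustrative, and the squared correlation in the unit variance follows the double-sampling regression form assumed here for Equation (5).

```python
def unit_variance(u, r, n_federal, n_state):
    """Braced quantity in Equation (5): u * [1 - (1 - n'/n) * r^2]."""
    return u * (1.0 - (1.0 - n_federal / n_state) * r ** 2)


def variance_of_unit_variance(x, beta, n_federal):
    """Equation (5): (beta - 1) * x^2 / n'."""
    return (beta - 1.0) * x ** 2 / n_federal


def leave_one_out(states, i):
    """Equations (6) and (7) for state i: the weighted mean of the other
    states' unit variances and the estimated variance of that mean."""
    others = [s for j, s in enumerate(states) if j != i]
    total_n = sum(s["n_federal"] for s in others)          # n' - n'_i
    mean = sum(s["n_federal"] * s["x"] for s in others) / total_n
    var = sum(s["n_federal"] ** 2 * s["var_x"] for s in others) / total_n ** 2
    return mean, var


# Hypothetical group of three states: (u, r, n', n, beta) are made-up values.
raw = [(0.030, 0.8, 150, 1200, 55.0),
       (0.025, 0.7, 200, 1000, 53.0),
       (0.040, 0.9, 360, 1500, 51.0)]
states = []
for u, r, n_fed, n_st, beta in raw:
    x = unit_variance(u, r, n_fed, n_st)
    states.append({"n_federal": n_fed, "x": x,
                   "var_x": variance_of_unit_variance(x, beta, n_fed)})

group_mean_0, group_var_0 = leave_one_out(states, 0)
print(group_mean_0, group_var_0)
```

The mean for state 0 uses only states 1 and 2, weighted by their Federal subsample sizes, so it is statistically independent of the state's own direct estimate, as the derivation of the optimum weight requires.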



The term $(E\bar{x}_{(i)} - Ex_i)^2$ in Equation (3) is the square of the bias that results 
when the average variance for the other states in a group is used as an estimate of 
the variance for state i in the group. 

To estimate $(E\bar{x}_{(i)} - Ex_i)^2$, we note that 

    $E(\bar{x}_{(i)} - x_i)^2 = E\{(\bar{x}_{(i)} - E\bar{x}_{(i)}) - (x_i - Ex_i) + (E\bar{x}_{(i)} - Ex_i)\}^2$ 

    $\qquad = \sigma_{\bar{x}_{(i)}}^2 + \sigma_{x_i}^2 + (E\bar{x}_{(i)} - Ex_i)^2$ 

since $x_i$ and $\bar{x}_{(i)}$ are independent. An unbiased estimate of the desired parameter, 
termed the square of the bias, is then given by 

    $(\bar{x}_{(i)} - x_i)^2 - s_{x_i}^2 - s_{\bar{x}_{(i)}}^2 .$ 

This could be computed directly for each state, but such estimates are subject to 
extremely large variances, too large to be useful. Instead, we consider, as a first 
alternative, using for each state the average squared bias for the whole group of 
states. We would therefore estimate this parameter for a group by 

    $b^2 = \frac{1}{m} \sum_i \{(x_i - \bar{x}_{(i)})^2 - s_{x_i}^2 - s_{\bar{x}_{(i)}}^2\} .$    (8) 






Although the parameter being estimated is non-negative, the estimate $b^2$ may take 
on a negative value for a group. In such a case, $b^2$ may be taken to have the value 
zero for the group. As a second alternative, because even the group averages are 
subject to wide sampling variation, the values of $b^2$ may be taken to be the average 
over all the groups. Even the average may be negative, in which case we may take 
$b^2 = 0$. Substituting the estimates of the parameters in Equation (3), we obtain the 
estimates of $W_i$. 
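
The group-level squared bias estimate and its truncation at zero can be sketched as below. The two small groups use made-up numbers, and `group_squared_bias` is a hypothetical helper name; the averaging over the states of a group is an assumption about the form of Equation (8).

```python
def squared_bias_terms(x, var_x, loo_mean, loo_var):
    """Per-state unbiased estimates:
    (xbar_(i) - x_i)^2 - s^2_{x_i} - s^2_{xbar_(i)}."""
    return [(m - xi) ** 2 - vx - vm
            for xi, vx, m, vm in zip(x, var_x, loo_mean, loo_var)]


def group_squared_bias(x, var_x, loo_mean, loo_var):
    """Average of the per-state terms over the group, set to zero if negative."""
    terms = squared_bias_terms(x, var_x, loo_mean, loo_var)
    return max(0.0, sum(terms) / len(terms))


# Hypothetical group A: two states whose unit variances differ widely.
b2_a = group_squared_bias(x=[0.010, 0.030], var_x=[1e-5, 1e-5],
                          loo_mean=[0.030, 0.010], loo_var=[1e-5, 1e-5])

# Hypothetical group B: two nearly identical states, so the raw average is
# negative and the estimate is truncated at zero.
b2_b = group_squared_bias(x=[0.020, 0.021], var_x=[1e-5, 1e-5],
                          loo_mean=[0.021, 0.020], loo_var=[1e-5, 1e-5])
print(b2_a, b2_b)
```

Group B shows the situation described in the text: when sampling variances are large relative to the true differences between states, the unbiased estimate can fall below zero and is replaced by zero.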

A further modification is suggested by the fact that the first term in the 
denominator of $W_i$, namely $\sigma_{x_i}^2$, is subject to a quite large variance. We therefore 
consider replacing the estimate $s_{x_i}^2$ by a more stable estimate of $\sigma_{x_i}^2$ in the following 
way. We first replace the quantity within the braces in Equation (5) by the average of 
such quantities for the other states in the same group; the latter is given by $\bar{x}_{(i)}$ of 
Equation (6). We then define the more stable estimator as the weighted mean of the 
new variance computed by Equation (5) and the direct estimate of variance for the 
state. Thus, we have 

    $\hat{s}_{x_i}^2 = \dfrac{(n' - n'_i)\,[(\beta_i - 1)/n'_i]\,\bar{x}_{(i)}^2 + n'_i\, s_{x_i}^2}{n'} .$ 

This is then substituted for $\sigma_{x_i}^2$ in Equation (3). 



The various parameters as discussed above were estimated for each 
state from the state data for fiscal year 1984, based on the groups of states as defined 
above and displayed in Figure E-2 and Tables E-3 and E-4. The average value of $b^2$ 
turned out to be negative. Table E-3 gives the composite estimates when $b^2$ is taken 
to be zero for each group. Table E-4 gives the composite estimates when $b^2$ is taken 
to be the weighted mean of the absolute values of the value of $b^2$ computed for each 
group. We refer to this as the "high" squared bias, because it is likely to be greater 
than the true squared bias (since its expected value is greater). In addition to the 
composite estimate of the unit variance for each state, Tables E-3 and E-4 display the 
size of the Federal subsample, n', the values of β (beta), the estimated error rate, the 
weight used in the composite estimator, and the experimental estimate (described 
below) of the variance of the estimated unit variances for both the direct estimate 
and the composite estimate. 

The weight calculated for a state is considerably greater when the 
"high" squared bias is used than when a zero squared bias is used. The true 
optimum weight is somewhere between the two, since the true squared bias is 
undoubtedly positive. Figure E-5 is a scatter diagram which shows the relationship 
of the weights for zero and high squared bias. We note that, on the average, the 
weight is about four times higher when the high squared bias is used. Figure E-4 
also shows this relationship. 

Because the composite estimator involves considerable computation, 
we consider also a simple pooled estimate of the unit variance. Groups of states are 
defined as above for the composite estimator. For state i of group g, the simple 
pooled estimator of the unit variance is given by 

    $\bar{s}_{gi}^2 = s_{gy}^2 \left\{ 1 - \left(1 - \dfrac{n'_{gi}}{n_{gi}}\right) \dfrac{s_{gxy}^2}{s_{gx}^2\, s_{gy}^2} \right\} .$ 

In this expression, $n_{gi}$ and $n'_{gi}$ denote the sizes of the state sample and the Federal 
subsample, respectively. The other quantities are weighted means of corresponding 
quantities for all states in the group. Specifically, 

    $s_{gx}^2 = \sum_i (n'_{gi} - 1)\, s_{gix}^2 \,/\, (n'_g - m_g)$ 

    $s_{gy}^2 = \sum_i (n'_{gi} - 1)\, s_{giy}^2 \,/\, (n'_g - m_g)$ 

    $s_{gxy} = \sum_i (n'_{gi} - 1)\, s_{gixy} \,/\, (n'_g - m_g)$ 

where 

    $m_g$ = number of states in group g 

    $s_{giy}^2$ = estimated unit variance of the Federal determination of 
    payment error 

    $s_{gix}^2$ = estimated unit variance of the state determination of 
    payment error, as estimated from the Federal subsample 

    $s_{gixy}$ = estimated unit covariance of the Federal and state 
    determinations of payment error. 
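
The pooling of the variance components within a group can be sketched as follows. The pooling weights (n' - 1) and the divisor (total Federal subsample minus number of states) come from the text; the final combination into a state unit variance is an assumption here, using the usual double-sampling regression form, and all numbers and function names are hypothetical.

```python
def pooled(values, federal_sizes):
    """Weighted mean of per-state components with weights (n'_gi - 1),
    i.e. the sum divided by (n'_g - m_g)."""
    m = len(values)
    total = sum(federal_sizes)
    return sum((n - 1) * v for v, n in zip(values, federal_sizes)) / (total - m)


def simple_pooled_unit_variance(s2_y, s2_x, s_xy, n_federal, n_state):
    """ASSUMED form: s2_y * [1 - (1 - n'/n) * r^2], with r^2 = s_xy^2 / (s2_x * s2_y)
    computed from the pooled components."""
    r_squared = s_xy ** 2 / (s2_x * s2_y)
    return s2_y * (1.0 - (1.0 - n_federal / n_state) * r_squared)


# Hypothetical group of two states (made-up component estimates).
federal_sizes = [101, 201]
s2_y = pooled([0.020, 0.040], federal_sizes)    # Federal determination
s2_x = pooled([0.018, 0.036], federal_sizes)    # state determination
s_xy = pooled([0.015, 0.030], federal_sizes)    # covariance

# Same pooled components, but each state's own sample sizes.
state_0 = simple_pooled_unit_variance(s2_y, s2_x, s_xy, n_federal=101, n_state=900)
state_1 = simple_pooled_unit_variance(s2_y, s2_x, s_xy, n_federal=201, n_state=1000)
print(s2_y, state_0, state_1)
```

Only the subsampling rate differs between states within a group in this sketch, which is what makes the simple pooled estimator much cheaper to compute than the composite estimator.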

Table E-5 displays the simple pooled estimates for each state. These 
closely resemble the composite estimates using zero squared bias, as exhibited in 
Figure E-6. The correlation between the two state estimates is .978. On the average, 
the simple pooled estimate is about 10 percent greater than the composite estimate. 

The variance of an estimate of the unit variance for a state is a function 
of the size of the sample used to estimate the unit variance. In Figure E-3 we show, 
by state, the ratio of the direct estimate for fiscal year 1984 to the composite estimate 
(using zero squared bias and the high squared bias) as related to the Federal sample 
size. The relationship, as expected, appears to be a monotone decreasing function of 
the sample size, concave upward, and somewhat flatter when using the high 
squared bias. 

An important reason for seeking a better estimate of the true unit 
variance in a given year is to predict the unit variance in a subsequent year, for the 
purpose of determining the sample sizes that will yield estimates of the payment 
error rate of some prescribed precision. In the discussion above, we have used data 
for fiscal years 1981 and 1982 to group states, and have then estimated unit variance 
for 1984. In practice, we would estimate the unit variance for 1983 and use it to 
predict the unit variance for 1984. Since this should be similar to "predicting" 1983 
from the 1984 estimates, we present such analyses here. Figure E-7 presents scatter 
diagrams showing the relationships of the several 1984 estimates of unit variance to 
the direct estimate for 1983. As shown in Figure E-7, each of the estimates for 1984 
shows a moderate correlation with 1983, of about .5 (ranging from .44 to .52). 

To evaluate the 1984 pooled variance estimator as a predictor of the 
1983 variance, let 

    $x_{ti}$ = direct estimate of unit variance for state i in year t, where 
    t=3 for fiscal year 1983 and t=4 for fiscal year 1984; 

    $z_{ti}$ = pooled estimate of unit variance; 

    $X_{ti}$ = true unit variance; and 

    $Z_{ti}$ = expected value of $z_{ti}$. 

We are interested in the correlation, over states, between the direct estimate for 1984 
and the true unit variance for 1983, and the correlation between each of the pooled 
unit variance estimates for 1984 and the true unit variance for 1983. We denote 
these correlation coefficients by $\rho_{x_4 X_3}$ and $\rho_{z_4 X_3}$, respectively. We define 

    $\Delta x_{4i} = x_{4i} - X_{4i}$ 

    $\bar{X}_4$ = average of $X_{4i}$ across states 

    $\bar{X}_3$ = average of $X_{3i}$ across states. 

The covariance of $x_{4i}$ and $X_{3i}$ across states is defined by 



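    $\sigma_{x_4 X_3} = E\,E\{(x_{4i} - \bar{X}_4)(X_{3i} - \bar{X}_3) \mid i\}$ 

    $\qquad = E\,E\{(X_{4i} + \Delta x_{4i} - \bar{X}_4)(X_{3i} - \bar{X}_3) \mid i\}$ 

    $\qquad = E\,E\{(X_{4i} - \bar{X}_4)(X_{3i} - \bar{X}_3) \mid i\}$ 

    $\qquad = \sigma_{X_4 X_3} .$ 

The variance of $x_{4i}$ across states is defined by 

    $\sigma_{x_4}^2 = E\,E\{(x_{4i} - \bar{X}_4)^2 \mid i\}$ 

    $\qquad = E\,E\{(X_{4i} + \Delta x_{4i} - \bar{X}_4)^2 \mid i\}$ 

    $\qquad = E\,E\{(X_{4i} - \bar{X}_4)^2\} + E\,E\{(\Delta x_{4i})^2 \mid i\}$ 

    $\qquad = E(X_{4i} - \bar{X}_4)^2 + E\{\sigma_{\Delta x_{4i}}^2 \mid i\}$ 

    $\qquad = \sigma_{X_4}^2 + \sigma_{\Delta x_4}^2$ , say. 

Since 

    $E(X_{3i} - \bar{X}_3)^2 = \sigma_{X_3}^2$ 

we have 

    $\rho_{x_4 X_3} = \dfrac{\sigma_{x_4 X_3}}{\sigma_{x_4}\sigma_{X_3}} = \dfrac{\sigma_{X_4 X_3}}{\sigma_{X_3}\sqrt{\sigma_{X_4}^2 + \sigma_{\Delta x_4}^2}} = \dfrac{\rho_{X_4 X_3}}{\sqrt{1 + \sigma_{\Delta x_4}^2/\sigma_{X_4}^2}} .$    (10) 

Similarly, it can be shown that 

    $\rho_{z_4 X_3} = \dfrac{\rho_{Z_4 X_3}}{\sqrt{1 + \sigma_{\Delta z_4}^2/\sigma_{Z_4}^2}} .$    (11) 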



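None of the correlations between the values $X_3$, $X_4$ and $Z_4$ can be 
estimated directly from the data. We can, however, estimate the correlations of 
their estimates, and similar algebraic manipulation shows that 

    $\rho_{x_4 x_3}^2 = \rho_{X_4 X_3}^2 \,\dfrac{1}{\left(1 + \sigma_{\Delta x_3}^2/\sigma_{X_3}^2\right)\left(1 + \sigma_{\Delta x_4}^2/\sigma_{X_4}^2\right)}$    (12) 

    $\rho_{x_3 z_4}^2 = \rho_{X_3 Z_4}^2 \,\dfrac{1}{\left(1 + \sigma_{\Delta x_3}^2/\sigma_{X_3}^2\right)\left(1 + \sigma_{\Delta z_4}^2/\sigma_{Z_4}^2\right)} .$    (13) 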



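Solving Equation (12) for $\rho_{X_4 X_3}$ and substituting into Equation (10), we 
obtain 

    $\rho_{x_4 X_3} = \sqrt{1 + \dfrac{\sigma_{\Delta x_3}^2}{\sigma_{X_3}^2}}\;\rho_{x_4 x_3} .$    (14) 

Similarly, solving Equation (13) for $\rho_{X_3 Z_4}$ and substituting into 
Equation (11), we obtain 

    $\rho_{z_4 X_3} = \sqrt{1 + \dfrac{\sigma_{\Delta x_3}^2}{\sigma_{X_3}^2}}\;\rho_{x_3 z_4} .$    (15) 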



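It is necessary to estimate the quantities in these equations. We have 

    $E\left[\dfrac{1}{51}\sum_{i=1}^{51}(x_{4i} - \bar{x}_4)^2\right] = \dfrac{1}{51}\sum_{i=1}^{51} E\{(x_{4i} - X_{4i}) + (X_{4i} - \bar{X}_4) - (\bar{x}_4 - \bar{X}_4)\}^2$ 

    $\qquad = \dfrac{1}{51}\sum_{i=1}^{51}\{E(x_{4i} - X_{4i})^2 + (X_{4i} - \bar{X}_4)^2 + E(\bar{x}_4 - \bar{X}_4)^2 - 2E(\bar{x}_4 - \bar{X}_4)(x_{4i} - X_{4i})\}$ 

    $\qquad = \dfrac{1}{51}\sum_{i=1}^{51}\{\sigma_{\Delta x_{4i}}^2 + (X_{4i} - \bar{X}_4)^2 - E(\bar{x}_4 - \bar{X}_4)^2\}$ 

    $\qquad = \sigma_{\Delta x_4}^2 + \sigma_{X_4}^2 - E(\bar{x}_4 - \bar{X}_4)^2 .$ 

We ignore the third term of the right member since it is small compared to the first 
term. Similarly, we have 

    $E\left[\dfrac{1}{51}\sum_{i=1}^{51}(z_{4i} - \bar{z}_4)^2\right] \approx \sigma_{\Delta z_4}^2 + \sigma_{Z_4}^2 .$ 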



From Table E-3, we compute the estimates of the quantities involved: 

    $\dfrac{1}{51}\sum_{i=1}^{51}(x_{4i} - \bar{x}_4)^2 = 1.2228 \times 10^{-4}$ 

    $s_{\Delta x_4}^2 = 6.4484 \times 10^{-5}$ 

so that 

    $s_{X_4}^2 = 5.7796 \times 10^{-5}$ 

and 

    $s_{\Delta x_4}^2 / s_{X_4}^2 = 1.1157 .$ 

We assume that this ratio has the same value for 1983 as for 1984, so that we take 

    $s_{\Delta x_3}^2 / s_{X_3}^2 = 1.1157 .$ 






From the data we have also estimated 

    $r_{x_3 x_4}$ = .439 

    $r_{x_3 z_4}$ = .473 for the composite estimate using zero squared bias 
               .519 for the composite estimate using high squared bias 
               .473 for the simple pooled estimate. 

Substituting the estimates into Equations (14) and (15) yields 

                                        $\hat{\rho}_{z_4 X_3}$ 
    $\hat{\rho}_{x_4 X_3}$     Composite zero bias    Composite high bias    Simple pooled 

         .639                 .688                   .755               .688 



Thus, the composite estimate using zero squared bias and the simple pooled 
variance estimates for 1984 have the same estimated correlation with the true unit 
variance for 1983. The estimated correlation with the direct estimate is somewhat 
lower. It is somewhat higher with the composite estimate with high squared bias. 
The differences may be real or due to sampling variability. These correlations are 
about 50 percent greater than the correlation between any of these estimates for 1984 
and the direct estimate for 1983. 
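
The arithmetic behind these corrected correlations can be retraced from the figures quoted in this note. The sketch below is illustrative only; the variable names are hypothetical, and the inputs are the values stated above (the mean squared deviation, the estimated sampling variance, and the four observed correlations).

```python
import math

mean_sq_dev = 1.2228e-4   # (1/51) * sum of (x_4i - xbar_4)^2, as quoted in the text
s2_delta = 6.4484e-5      # estimated sampling variance of the direct estimates
s2_true = mean_sq_dev - s2_delta        # estimated variance of the true values
ratio = s2_delta / s2_true              # assumed equal for 1983 and 1984
factor = math.sqrt(1.0 + ratio)         # multiplier in Equations (14) and (15)

observed = {"direct": 0.439,
            "composite, zero bias": 0.473,
            "composite, high bias": 0.519,
            "simple pooled": 0.473}

# Estimated correlation of each 1984 estimate with the true 1983 unit variance.
corrected = {name: round(r * factor, 3) for name, r in observed.items()}
# Coefficients of determination, as reported to two places in Table E-1.
determination = {name: round(c ** 2, 2) for name, c in corrected.items()}

for name in observed:
    print(name, corrected[name], determination[name])
```

The multiplier is about 1.45, which is the source of the statement that the corrected correlations are about 50 percent greater than the observed ones.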

We return to explain the computation of the variances of the 
composite estimators, as shown in the last column of Tables E-3 and E-4. These 
values, which we have termed "experimental," are based on the following 
speculation. For economy of notation, let $s_i^2$ denote the variance defined by 
Equation (5) and $s_{(i)}^2$ the variance defined by Equation (7). Let $\tilde{s}_i^2$ denote the 
composite estimate of the unit variance for state i, i.e., 

    $\tilde{s}_i^2 = W_i s_i^2 + (1 - W_i)\, s_{(i)}^2 .$ 

Conditional on the value of $W_i$, 

    $\mathrm{Var}(\tilde{s}_i^2) = W_i^2\, \mathrm{Var}(s_i^2) + (1 - W_i)^2\, \mathrm{Var}(s_{(i)}^2)$    (16) 

since $s_i^2$ and $s_{(i)}^2$ are independent. We take 

    $\mathrm{Var}(s_i^2) = (\beta_i - 1)\,(s_i^2)^2 / n'_i$ 

and 

    $\mathrm{Var}(s_{(i)}^2) = \sum_{j \ne i} (n'_j)^2\, \mathrm{Var}(s_j^2) \,/\, (n' - n'_i)^2 .$ 

The experimental estimate of the variance or mean square error is given by 
substituting estimates of the quantities in Equation (16). 
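
Equation (16) and the resulting variance ratio can be illustrated with a short sketch. The two input variances are hypothetical, the weight is the zero-squared-bias weight from Equation (3), and `composite_variance` is an illustrative name.

```python
def composite_variance(weight, var_local, var_group):
    """Equation (16): W^2 Var(s_i^2) + (1 - W)^2 Var(s_(i)^2), with W treated as fixed."""
    return weight ** 2 * var_local + (1.0 - weight) ** 2 * var_group


# Hypothetical variances of the direct estimate and of the group mean.
var_local, var_group = 4.0e-4, 0.4e-4

# Zero-squared-bias weight on the direct estimate (Equation (3) with no bias term).
weight = var_group / (var_local + var_group)

var_composite = composite_variance(weight, var_local, var_group)
ratio = var_local / var_composite   # variance of direct / variance of composite
print(weight, var_composite, ratio)
```

With the zero-bias weight the ratio reduces to (var_local + var_group) / var_group, so it grows as the group mean becomes more precise relative to the state's own estimate. This is the kind of ratio plotted against Federal subsample size in Figure E-3.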

The problem with direct variance estimates by states is their greater 
sampling variability, as discussed in Section 2.5 of the report and in Appendix C. 
We conclude that the sampling variability of the composite estimator is 
considerably less, as illustrated in Figure E-3. Consequently, for making estimates of 
needed sample sizes, at least, the composite estimates are likely to have substantial 
advantage over the use of the direct state variance estimates. 

With the squared biases assumed equal to zero, use of the pooled unit 
variance estimate for each state results in a mean square error of the variance 
estimates that varies from about one-sixth to one-fourteenth as large, depending on 
the size of the state and Federal samples, as the variance of the unit variance 
estimate based only on the current data for a state. This may modestly overstate the 
gains. The mean square errors for the estimates assuming biases show mean square 
error reductions of about half this amount, but these substantially understate the 
gains because the biases, by design, are substantial overestimates. Clearly, the 
improvement through pooled variance estimation is substantial for all states, but is 
of course greatest for the states with the smaller AFDC-QC samples. 






Table E-2. Data and results of regression estimates of variance, by states 





Estimated payment error rate 


Estimated unit variance 


Regression estimates 




Period 


Period 


Period 


State 


1 


2 


3 


4 


1 


2 


3 


4 


3 


4 


AK 


.1376 


.2211 


.1288 


.1104 


.04417 


.13373 


.07181 


.05147 


.06947 


.04653 


AL 


.0832 


.0709 


.0551 


.0508 


.03238 


.01607 


.01482 


.01842 


.01522 


.01380 


AR 


.0657 


.0701 


.0884 


.0521 


.01705 


.00986 


.02907 


.01493 


.02303 


.01807 


AZ 


.0874 


.0784 


.1155 


.1165 


.02107 


.02190 


.02900 


.02887 


.03421 


.03628 


CA 


.0861 


.0500 


.0736 


.0463 


.04103 


.01264 


.02717 


.01636 


.01971 


.01604 


00 


.0998 


.0652 


.0500 


.0800 


.04197 


.01983 


.01643 


.01496 


.01486 


.02273 


CT 


.0798 


.0690 


.0528 


.0748 


.00155 


.00999 


.01124 


.04580 


.01280 


.01967 


DC 


.1511 


.1198 


.1759 


.1666 


.00896 


.04274 


.05690 


.03832 


.05711 


.05820 


DE 


.1285 


.1024 


.1008 


.1357 


.06263 


.03811 


.06372 


.13296 


.03440 


.05206 


FL 


.0749 


.0836 


.0631 


.0576 


.00617 


.01429 


.01126 


.00972 


.01691 


.01459 


GA 


.0732 


.0577 


.0477 


.0549 


.01105 


.00737 


.01773 


.02082 


.01069 


.01594 


HI 


.1012 


.1008 


.0872 


.0770 


.03756 


.02881 


.03234 


.05748 


.02786 


.02609 


lA 


.0440 


.0411 


.0406 


.0490 


.00890 


.00300 


.00775 


.01111 


.00820 


.01143 


ID 


.1265 


.0507 


.0473 


.0613 


.07180 


.02711 


.01128 


.02674 


.01627 


.01591 


IL 


.0860 


.0793 


.0767 


.0883 


.02616 


.01034 


.02030 


.03339 


.01966 


.02596 


IN 


.0520 


.0323 


.0345 


.0425 


.01478 


.00311 


.00429 


.01227 


.00653 


.00863 


KS 


.0751 


.0870 


.0562 


.0008 


.00967 


.03703 


.02391 


.00004 


.02158 


.02480 


KY 


.0550 


.0443 


.0337 


.0378 


.00773 


.00396 


.00443 


.00773 


.00643 


.00729 


lA 


.0577 


.0763 


.0645 


.0604 


.01396 


.01300 


.02134 


.00727 


.01705 


.01838 


MA 


.1112 


.0735 


.0545 


.0944 


.03411 


.01689 


.00842 


.01699 


.01517 


.02444 


MD 


.1179 


.1132 


.0911 


.0733 


.01047 


.02996 


.02363 


.01769 


.02916 


.02239 


ME 


.0861 


.0716 


.0326 


.0291 


.02243 


.01710 


.01381 


.00280 


.01479 


.00788 


MI 


.0691 


.0767 


.0898 


.0814 


.01040 


.03191 


.01360 


.00946 


.02984 


.02190 


MN 


.0381 


.0499 


.0309 


.0297 


.01933 


.02637 


.01413 


.02225 


.01169 


.00783 


MO 


.0648 


.0770 


.0611 


.0343 


.01834 


.01609 


.01344 


.01141 


.01695 


.00858 


MS 


.0733 


.0649 


.0500 


.0446 


.01431 


.02044 


.01391 


.02405 


.01513 


.01182 


MT 


.0688 


.0305 


.0113 


.0384 


.02961 


.00330 


.00303 


.02612 


-.00006 


.00730 


NC 


.0619 


.0465 


.0372 


.0283 


.00839 


.00406 


.00288 


.00432 


.00684 


.00407 


hD 


.0330 


.0287 


.0128 


.0234 


.01668 


.00666 


.00284 


.00736 


.00084 


.00350 


NE 


.0410 


.0676 


.0386 


.1323 


.01849 


.04252 


.01943 


.08227 


.02417 


.03862 


m 


.0549 


.0771 


.0584 


.0387 


.04010 


.00648 


.02194 


.02364 


.01338 


.01811 


NJ 


.0836 


.0770 


.0936 


.0522 


.02134 


.02009 


.02882 


.00900 


.02741 


.01795 


NM 


.1241 


.1236 


.1189 


.0913 


.04409 


.04972 


.02936 


.02926 


.04284 


.02908 


NV 


.0250 


.0203 


.0147 


.0104 


.01310 


.00019 


.00132 


.00941 


-.00041 


-.00119 


NY 


.0912 


.0694 


.0681 


.0913 


.01118 


.01816 


.01033 


.02338 


.01936 


.02407 


CH 


.0838 


.0933 


.0769 


.0733 


.02362 


.02783 


.01982 


.03070 


.02474 


.02204 


OK


.0492 


.0829 


.0463 


.0286 


.03647 


.03942 


.01663 


.01143 


.01962 


.00800 


OR


.0670 


.0685 


.0734 


.0679 


.01669 


.05963 


.04394 


.02411 


.03336 


.02771 


PA 


.0979 


.0830 


.0937 


.0762 


.01062 


.01364 


.03423 


.01128 


.02344 


.02642 


RI 


.0676 


.0373 


.0384 


.0348 


.02607 


.01144 


.02007 


.02837 


.01498 


.01631 


SC 


.0739 


.0828 


.0937 


.0839 


.01371 


.00972 


.02264 


.01340 


.02437 


.02322 


SD 


.0721 


.0208 


.0376 


.0363 


.06411 


.00378 


.01002 


.01380 


.01001 


.00860 


TN 


.1019 


.0771 


.0337 


.0427 


.01231 


.02033 


.01323 


.00928 


.01639 


.01157 


TX 


.0711 


.0791 


.0881 


.0790 


.02880 


.01393 


.02411 


.02163 


.02463 


.02431 


UT


.0598 


.0371 


.0343 


.0437 


.03343 


.01037 


.01937 


.01897 


.01373 


.01385 


VA 


.0369 


.0349 


.0330 


.0481 


.00867 


.00470 


.00238 


.01301 


.00600 


.00968 


VT


.0382 


.0646 


.0366 


.0327 


.01421 


.03737 


.00749 


.01362 


.02212 


.00645 


WA 


.0985 


.0868 


.0731 


.0560 


.07344 


.02723 


.02435 


.00640 


.02348 


.01788 


WI 


.0942 


.0771 


.0489 


.0489 


.02133 


.01907 


.01607 


.01607 


.01423 


.01366 


WV


.1894 


.0762 


.0811 


.0838 


.10833 


.01831 


.01319 


.02310 


.02302 


.02314 


WY 


.0709 


.0836 


.0385 


.0563 


.03275 


.03739 


.02020 


.04434 


.01671 


.01707 






Table E-3. Composite estimates of unit variance, using zero squared bias, by states

                                        Unit variance       Variance of:
                                                            Group         Local         Group     Variance of
State  n'      beta   f        Weight   Direct    Composite average       variance      variance  composite
ND     144     39     .456     .066     .0268     .0146     4.028E-06     5.729E-05     .0137     3.764E-06
NV     151     39     .462     .079     .0146     .0146     4.599E-06     5.336E-05     .0146     4.235E-06
NC     368     56     .147     .158     .0070     .0074     1.550E-06     8.288E-06     .0075     1.306E-06
IA     344     51     .239     .151     .0077     .0095     2.368E-06     1.327E-05     .0098     2.009E-06
VT     145     38     .482     .076     .0185     .0151     4.767E-06     5.770E-05     .0148     4.403E-06
KY     360     56     .156     .156     .0060     .0076     1.666E-06     8.992E-06     .0079     1.406E-06
VA     364     55     .162     .155     .0075     .0078     1.643E-06     8.978E-06     .0078     1.389E-06
IN     377     56     .161     .165     .0034     .0077     1.833E-06     9.258E-06     .0085     1.530E-06
NH     147     39     .473     .059     .0335     .0148     3.889E-06     6.214E-05     .0137     3.660E-06
UT     177     37     .500     .095     .0191     .0155     5.096E-06     4.860E-05     .0152     4.612E-06
MT     150     38     .479     .066     .0350     .0244     1.035E-05     1.455E-04     .0236     9.659E-06
ME     219     46     .335     .083     .0197     .0189     6.669E-06     7.358E-05     .0189     6.115E-06
FL     360     56     .153     .102     .0167     .0122     2.658E-06     2.337E-05     .0117     2.386E-06
AR     241     51     .252     .085     .0113     .0158     4.899E-06     5.258E-05     .0162     4.481E-06
KS     257     48     .298     .087     .0243     .0176     5.460E-06     5.705E-05     .0170     4.983E-06
SD     151     39     .456     .069     .0124     .0231     1.021E-05     1.384E-04     .0239     9.512E-06
LA     373     56     .154     .115     .0137     .0123     2.897E-06     2.232E-05     .0121     2.564E-06
GA     361     56     .146     .110     .0140     .0120     2.729E-06     2.210E-05     .0118     2.429E-06
CT     358     53     .211     .125     .0074     .0143     4.400E-06     3.092E-05     .0152     3.852E-06
MO     405     56     .149     .131     .0112     .0121     3.008E-06     1.999E-05     .0123     2.615E-06
TN     366     56     .159     .120     .0095     .0125     3.236E-06     2.362E-05     .0129     2.846E-06
RI     219     44     .369     .106     .0172     .0217     1.115E-05     9.444E-05     .0222     9.975E-06
SC     363     54     .194     .154     .0099     .0153     6.529E-06     3.587E-05     .0162     5.524E-06
NY     357     56     .148     .118     .0239     .0140     4.251E-06     3.163E-05     .0127     3.747E-06
CO     288     48     .299     .130     .0091     .0189     9.381E-06     6.268E-05     .0203     8.160E-06
MI     364     56     .150     .151     .0129     .0139     5.219E-06     2.945E-05     .0141     4.433E-06
PA     365     56     .148     .106     .0273     .0138     3.830E-06     3.237E-05     .0122     3.425E-06
WI     372     56     .149     .143     .0182     .0140     4.814E-06     2.892E-05     .0134     4.127E-06
AZ     258     49     .286     .092     .0359     .0192     7.216E-06     7.094E-05     .0175     6.549E-06
MS     361     55     .176     .149     .0036     .0144     6.229E-06     3.553E-05     .0163     5.300E-06
MN     366     54     .192     .152     .0038     .0149     6.673E-06     3.713E-05     .0169     5.657E-06
MA     366     56     .149     .127     .0184     .0181     7.176E-06     4.924E-05     .0181     6.263E-06
NJ     362     56     .149     .130     .0148     .0180     7.468E-06     4.994E-05     .0185     6.497E-06
AL     367     55     .179     .116     .0255     .0191     7.184E-06     5.452E-05     .0182     6.348E-06
WV     298     51     .239     .115     .0126     .0209     9.814E-06     7.573E-05     .0220     8.688E-06
OK     278     50     .268     .107     .0067     .0217     1.075E-05     8.958E-05     .0235     9.596E-06
ID     156     37     .495     .069     .0540     .0300     1.545E-05     2.091E-04     .0283     1.439E-05
MD     363     56     .150     .132     .0130     .0181     7.651E-06     5.043E-05     .0188     6.643E-06
WY     164     39     .471     .076     .0334     .0289     1.574E-05     1.922E-04     .0285     1.455E-05
CA     387     56     .151     .120     .0245     .0181     6.534E-06     4.773E-05     .0173     5.747E-06
TX     363     56     .149     .122     .0208     .0181     6.913E-06     4.979E-05     .0177     6.070E-06
IL     382     56     .152     .116     .0211     .0151     4.539E-06     3.445E-05     .0143     4.010E-06
NM     230     46     .337     .103     .0141     .0228     1.201E-05     1.048E-04     .0238     1.078E-05
OH     368     56     .151     .144     .0083     .0152     6.041E-06     3.598E-05     .0164     5.173E-06
NE     199     43     .397     .089     .0316     .0256     1.337E-05     1.376E-04     .0250     1.218E-05
DC     240     48     .297     .095     .0261     .0213     9.391E-06     8.929E-05     .0208     8.497E-06
HI     211     45     .349     .087     .0316     .0235     1.112E-05     1.162E-04     .0228     1.015E-05
WA     389     54     .182     .135     .0110     .0165     6.999E-06     3.804E-05     .0175     5.911E-06
OR     280     50     .264     .116     .0147     .0199     9.245E-06     7.031E-05     .0206     8.171E-06
AK     160     38     .479     .082     .0327     .0290     1.734E-05     1.930E-04     .0286     1.591E-05
DE     164     36     .524     .083     .0453     .0311     1.897E-05     2.046E-04     .0298     1.736E-05

Average                                 .0185     .0173     6.999E-06     6.448E-05     .0173     6.266E-06
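
The first row of Table E-3 (n' = 144) can be reproduced from its own entries under one reading of the columns. The sketch below assumes, as an inference from the tabulated numbers and not as a formula stated in this appendix, that with zero squared bias the weight on the direct estimate is the variance of the group average divided by the sum of that variance and the variance of the local (direct) estimate. The function and argument names are illustrative only.

```python
def composite(direct, group, var_direct, var_group):
    """Combine a direct (local) estimate with a group average.

    Assumption (inferred from the table, not quoted from the report):
    weight on the direct estimate = var_group / (var_group + var_direct).
    """
    weight = var_group / (var_group + var_direct)
    estimate = weight * direct + (1.0 - weight) * group
    # Variance of the weighted combination when the two parts are independent
    variance = var_group * var_direct / (var_group + var_direct)
    return weight, estimate, variance


# Entries from the first row of Table E-3
w, est, var = composite(direct=0.0268, group=0.0137,
                        var_direct=5.729e-05, var_group=4.028e-06)
print(round(w, 3), round(est, 4), var)
```

With these inputs the weight rounds to .066 and the composite unit variance to .0146, which agrees with the tabulated entries to rounding.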






Table E-4. Composite estimates of unit variance, using high estimate of average squared bias, by states

                                        Unit variance       Variance of:
                                                            Group         Local         Group     Variance of
State  n'      beta   f        Weight   Direct    Composite average       variance      variance  composite
ND     144     39     .456     .387     .0268     .0188     4.028E-06     5.729E-05     .0137     1.008E-05
NV     151     39     .462     .407     .0146     .0146     4.599E-06     5.336E-05     .0146     1.047E-05
NC     368     56     .147     .802     .0070     .0071     1.550E-06     8.288E-06     .0075     5.396E-06
IA     344     51     .239     .722     .0077     .0083     2.368E-06     1.327E-05     .0098     7.099E-06
VT     145     38     .482     .390     .0185     .0162     4.767E-06     5.770E-05     .0148     1.054E-05
KY     360     56     .156     .790     .0060     .0064     1.666E-06     8.992E-06     .0079     5.680E-06
VA     364     55     .162     .790     .0075     .0076     1.643E-06     8.978E-06     .0078     5.673E-06
IN     377     56     .161     .786     .0034     .0045     1.833E-06     9.258E-06     .0085     5.798E-06
NH     147     39     .473     .367     .0335     .0209     3.889E-06     6.214E-05     .0137     9.915E-06
UT     177     37     .500     .433     .0191     .0169     5.096E-06     4.860E-05     .0152     1.077E-05
MT     150     38     .479     .226     .0350     .0262     1.035E-05     1.455E-04     .0236     1.362E-05
ME     219     46     .335     .345     .0197     .0192     6.669E-06     7.358E-05     .0189     1.162E-05
FL     360     56     .153     .598     .0167     .0147     2.658E-06     2.337E-05     .0117     8.784E-06
AR     241     51     .252     .413     .0113     .0142     4.899E-06     5.258E-05     .0162     1.065E-05
KS     257     48     .298     .397     .0243     .0199     5.460E-06     5.705E-05     .0170     1.097E-05
SD     151     39     .456     .234     .0124     .0212     1.021E-05     1.384E-04     .0239     1.358E-05
LA     373     56     .154     .610     .0137     .0131     2.897E-06     2.232E-05     .0121     8.759E-06
GA     361     56     .146     .612     .0140     .0131     2.729E-06     2.210E-05     .0118     8.681E-06
CT     358     53     .211     .541     .0074     .0110     4.400E-06     3.092E-05     .0152     9.985E-06
MO     405     56     .149     .637     .0112     .0116     3.008E-06     1.999E-05     .0123     8.511E-06
TN     366     56     .159     .599     .0095     .0109     3.236E-06     2.362E-05     .0129     9.003E-06
RI     219     44     .369     .314     .0172     .0207     1.115E-05     9.444E-05     .0222     1.456E-05
SC     363     54     .194     .518     .0099     .0129     6.529E-06     3.587E-05     .0162     1.116E-05
NY     357     56     .148     .535     .0239     .0187     4.251E-06     3.163E-05     .0127     9.962E-06
CO     288     48     .299     .398     .0091     .0159     9.381E-06     6.268E-05     .0203     1.334E-05
MI     364     56     .150     .559     .0129     .0134     5.219E-06     2.945E-05     .0141     1.021E-05
PA     365     56     .148     .526     .0273     .0201     3.830E-06     3.237E-05     .0122     9.816E-06
WI     372     56     .149     .561     .0182     .0161     4.814E-06     2.892E-05     .0134     1.002E-05
AZ     258     49     .286     .357     .0359     .0240     7.216E-06     7.094E-05     .0175     1.200E-05
MS     361     55     .176     .519     .0036     .0097     6.229E-06     3.555E-05     .0163     1.101E-05
MN     366     54     .192     .511     .0038     .0102     6.673E-06     3.713E-05     .0169     1.128E-05
MA     366     56     .149     .444     .0184     .0182     7.176E-06     4.924E-05     .0181     1.191E-05
NJ     362     56     .149     .442     .0148     .0169     7.468E-06     4.994E-05     .0185     1.208E-05
AL     367     55     .179     .419     .0255     .0213     7.184E-06     5.452E-05     .0182     1.199E-05
WV     298     51     .239     .356     .0126     .0186     9.814E-06     7.573E-05     .0220     1.368E-05
OK     278     50     .268     .324     .0067     .0181     1.075E-05     8.958E-05     .0235     1.429E-05
ID     156     37     .495     .185     .0540     .0330     1.545E-05     2.091E-04     .0283     1.743E-05
MD     363     56     .150     .441     .0130     .0163     7.651E-06     5.043E-05     .0188     1.219E-05
WY     164     39     .471     .199     .0334     .0295     1.574E-05     1.922E-04     .0285     1.772E-05
CA     387     56     .151     .447     .0245     .0205     6.534E-06     4.773E-05     .0173     1.154E-05
TX     363     56     .149     .439     .0208     .0191     6.913E-06     4.979E-05     .0177     1.178E-05
IL     382     56     .152     .515     .0211     .0178     4.539E-06     3.445E-05     .0143     1.021E-05
NM     230     46     .337     .296     .0141     .0210     1.201E-05     1.048E-04     .0238     1.514E-05
OH     368     56     .151     .514     .0083     .0122     6.041E-06     3.598E-05     .0164     1.095E-05
NE     199     43     .397     .248     .0316     .0266     1.337E-05     1.376E-04     .0250     1.604E-05
DC     240     48     .297     .317     .0261     .0225     9.391E-06     8.929E-05     .0208     1.336E-05
HI     211     45     .349     .271     .0316     .0252     1.112E-05     1.162E-04     .0228     1.444E-05
WA     389     54     .182     .507     .0110     .0142     6.999E-06     3.804E-05     .0175     1.147E-05
OR     280     50     .264     .370     .0147     .0184     9.245E-06     7.031E-05     .0206     1.330E-05
AK     160     38     .479     .204     .0327     .0295     1.734E-05     1.930E-04     .0286     1.902E-05
DE     164     36     .524     .200     .0453     .0329     1.897E-05     2.046E-04     .0298     2.031E-05

Average                                 .0185     .0174     6.999E-06     6.448E-05     .0173     1.153E-05






Table E-5. Pooled unit variance estimates, by states

                           Pooled
State    n'      f         unit variance
ND       144     0.456     0.01708
NV       151     0.462     0.01724
NC       368     0.147     0.00885
IA       344     0.239     0.01130
VT       145     0.482     0.01777
KY       360     0.156     0.00908
VA       364     0.162     0.00924
IN       377     0.161     0.00922
NH       147     0.473     0.01753
UT       177     0.500     0.01826
MT       150     0.479     0.02550
ME       219     0.335     0.02008
FL       360     0.153     0.01320
AR       241     0.252     0.01692
KS       257     0.298     0.01866
SD       151     0.456     0.02463
LA       373     0.154     0.01323
GA       361     0.146     0.01295
CT       358     0.211     0.01539
MO       405     0.149     0.01306
TN       366     0.159     0.01343
RI       219     0.369     0.02686
SC       363     0.194     0.01907
NY       357     0.148     0.01705
CO       288     0.299     0.02372
MI       364     0.150     0.01712
PA       365     0.148     0.01702
WI       372     0.149     0.01706
AZ       258     0.286     0.02317
MS       361     0.176     0.01829
MN       366     0.192     0.01899
MA       366     0.149     0.02251
NJ       362     0.149     0.02250
AL       367     0.179     0.02368
WV       298     0.239     0.02598
OK       278     0.268     0.02710
ID       156     0.495     0.03592
MD       363     0.150     0.02256
WY       164     0.471     0.03499
CA       387     0.151     0.02259
TX       363     0.149     0.02251
IL       382     0.152     0.01794
NM       230     0.337     0.02543
OH       368     0.151     0.01792
NE       199     0.397     0.02785
DC       240     0.297     0.02379
HI       211     0.349     0.02592
WA       389     0.182     0.01917
OR       280     0.264     0.02249
AK       160     0.479     0.03115
DE       164     0.524     0.03296






Figure E-5. Weights for the composite estimate using zero and high squared bias

[Scatter plot. Horizontal axis: weight using zero squared bias. Vertical axis: weight using high squared bias. ρ = .775]



Figure E-6. Relationship of the simple pooled estimate and the composite estimate using zero squared bias

[Scatter plot. Horizontal axis: composite estimate of unit variance using zero squared bias. Vertical axis: simple pooled estimate of unit variance. ρ = .978]






Figure E-7. Relationship of various estimates of unit variance for 1984 to the direct estimate for 1983 (x 10^)

[Four scatter plots, each with horizontal axis: estimate of unit variance, FY 1983. Vertical axes: (1) direct unit variance, FY 1984; (2) composite estimate of unit variance using zero squared bias, FY 1984; (3) composite estimate of unit variance using high squared bias, FY 1984; (4) simple pooled estimate of unit variance, FY 1984 (ρ = .473).]






APPENDIX F 
OPTIMUM SAMPLE SIZE FOR DISALLOWANCES BASED ON POINT ESTIMATES 



For the purpose of this appendix, we may define the optimum sample 
size as that which minimizes the cost. But the cost can be defined in more than one 
way. We shall define the expected cost from the Federal point of view as the Federal 
share of the cost of review of the state sample plus the cost of review and processing 
the Federal subsample minus the expected value of the disallowance assessed. We 
shall define the expected cost from a state's point of view as its cost of processing the 
state sample plus the expected value of the disallowance assessed. 

Let us denote

    U    = the Federal contribution for the time period;

    k    = proportion of the cost that is borne by the state;

    n    = size of the state sample;

    n'   = size of the Federal subsample;

    c_0  = state share of the state cost per case in the state sample;

    c_1  = Federal share of the state cost per case in the state sample;

    c_2  = Federal cost per case in the Federal subsample;

    r    = estimated payment error rate; and

    R    = E(r), the expectation of r.

We consider, first, the problem of minimizing the variance (thus
maximizing the precision) of the estimated payment error rate, for a fixed Federal
cost K defined by K = c_1 n + c_2 n'. The minimizing values of n and n' are obtained by
setting equal to zero the partial derivatives of the function

\[ V = \sigma_r^2 - \lambda (K - c_1 n - c_2 n') \]

and solving the resulting equations for \(\lambda\), n and n'. This gives the optimum
subsampling fraction f = n'/n as

\[ f^2 = \{(1-\rho^2)/\rho^2\}\{c_1/c_2\}. \]

The optimizing sample sizes are then

\[ n = K/(c_1 + f c_2) \]

\[ n' = f n. \]

Present plans call for annual samples of n=2400 and n'=360 in large
states. It has been estimated that c_1=$130 and c_2=$330, which gives rise to the value
K=$430,800. The values that would minimize the variance for that cost would be
n=1667 and n'=649.
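
The arithmetic behind these figures can be checked with a short sketch. It assumes a correlation of ρ = .85, which is not stated in this paragraph but is the value used in the worked example later in this appendix; the function name is illustrative.

```python
import math


def optimum_allocation(K, c1, c2, rho):
    """Return (f, n, n') minimizing the variance for a fixed Federal cost K.

    Uses f^2 = ((1 - rho^2) / rho^2) * (c1 / c2), n = K / (c1 + f*c2), n' = f*n.
    """
    f = math.sqrt((1 - rho**2) / rho**2 * c1 / c2)
    n = K / (c1 + f * c2)
    return f, n, f * n


# Federal cost of the planned samples: n = 2400 at $130 and n' = 360 at $330
K = 130 * 2400 + 330 * 360

# rho = 0.85 is an assumption taken from the later worked example
f, n, n_prime = optimum_allocation(K, c1=130, c2=330, rho=0.85)
print(K, round(f, 3), round(n), round(n_prime))
```

Under that assumption the sketch returns K = 430,800 and sample sizes that round to n = 1667 and n' = 649.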

We now suppose that a portion of the Federal contribution U to a state 
is withheld when the point estimate of the payment error rate, r, exceeds .03, and 
that then the disallowance is the fraction of the Federal contribution equal to the 
excess of r over the tolerance level .03. Let 

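\[ \xi = (r - .03)U \]

\[ \mu = E(\xi) = (R - .03)U \]

\[ \sigma^2 = \sigma_\xi^2 = U^2 \sigma_r^2 . \]

The disallowance is defined by

\[ D = \begin{cases} \xi, & \text{if } r > .03 \\ 0, & \text{otherwise.} \end{cases} \]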






It can be shown (see note at the end of this Appendix) that, since \(\xi\) is approximately
normally distributed, the expected value of the disallowance is approximately

\[ E(D) = (\sigma/\sqrt{2\pi}) \exp(-\mu^2/2\sigma^2) + (\mu/\sqrt{2\pi}) \int_{-\mu/\sigma}^{\infty} \exp(-t^2/2)\, dt. \]

This expression can be evaluated, given the values of \(\sigma\) (which is a function of n, n',
and certain other parameters) and of \(\mu\) (which is a function of R and U).
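
As a minimal sketch of how the expression can be evaluated, the integral term is the upper tail of the standard normal distribution and can be written with the error function. The values of U, R and the standard error of r below are hypothetical, chosen only to illustrate the call.

```python
import math


def expected_disallowance(mu, sigma):
    """E(D) = sigma*phi(mu/sigma) + mu*P(Z > -mu/sigma) for a normal xi."""
    density_term = math.exp(-mu**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi)
    # (1/sqrt(2*pi)) * integral from -mu/sigma to infinity of exp(-t^2/2) dt
    upper_tail = 0.5 * (1 + math.erf(mu / (sigma * math.sqrt(2))))
    return sigma * density_term + mu * upper_tail


# Hypothetical inputs (not from the report)
U = 100_000_000      # Federal contribution
R = 0.05             # expected payment error rate
se_r = 0.01          # standard error of the estimated rate

mu = (R - 0.03) * U  # expected excess over the tolerance, in dollars
sigma = U * se_r     # standard deviation of xi
print(expected_disallowance(mu, sigma))
```

When R equals the tolerance (mu = 0) the expected disallowance reduces to sigma divided by the square root of 2π, and when R is far above the tolerance it approaches mu.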

The expected value of the gain to the Federal government is 

\[ G = E(D) - c_1 n - c_2 n' . \]

We pose the question: given that it is required to attain a variance \(\sigma_r^2\)
of the estimated payment error rate, is it possible to choose a state sample size n that
maximizes G? We have

\[ \sigma_r^2 = \{\sigma_x^2/(T^2 n')\}\{1 - (1 - n'/n)\rho^2\}, \]

where

    \sigma_x^2 = variance of the payment error finding by the Federal review;

    T = average AFDC payment (see note 1); and

    \rho = correlation between the Federal and state findings.

Note 1: We have used T (which is a constant) in the estimate rather than the estimate
from the state sample, in order to simplify this analysis.

To attain a given variance \(\sigma_r^2\) given the sample size n, we must have

\[ n' = (1-\rho^2)/(T^2\sigma_r^2/\sigma_x^2 - \rho^2/n). \]

Since n' ≤ n, we must satisfy the inequality

\[ n \ge \sigma_x^2/(T^2 \sigma_r^2). \]

Thus, for example, if the standard error is to be at most \(\sigma_r\)=.01, and if the ratio
\(\sigma_x\)/T=.2, then the state sample size must be at least n=400. The Federal subsample
size would then have to be n'=289 if ρ=.85. If n were increased to 2400, the desired
standard error would be attained with n'=127.
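
The relation between n and n' for a target standard error can be sketched as follows. The function names are illustrative, and `ratio` stands for the quantity σ_x/T; the example reproduces the n = 2400 case quoted above.

```python
import math


def required_federal_subsample(n, se, ratio, rho):
    """n' needed to reach standard error `se` with a state sample of size n.

    Implements n' = (1 - rho^2) / (T^2 se^2 / sigma_x^2 - rho^2 / n),
    where ratio = sigma_x / T.
    """
    denominator = (se / ratio) ** 2 - rho**2 / n
    if denominator <= 0:
        raise ValueError("target standard error cannot be attained with this n")
    return (1 - rho**2) / denominator


def minimum_state_sample(se, ratio):
    """Smallest n for which the target can be met: sigma_x^2 / (T^2 se^2)."""
    return (ratio / se) ** 2


print(minimum_state_sample(se=0.01, ratio=0.2))
print(math.ceil(required_federal_subsample(n=2400, se=0.01, ratio=0.2, rho=0.85)))
```

Rounding the second result up to a whole case gives the Federal subsample size for n = 2400.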

For values of n satisfying the conditions stated above, we now examine
the properties of G as a function of n for a fixed value of \(\sigma_r\). We have

\[ dG/dn = -c_1 - c_2\, dn'/dn \]

\[ = -c_1 + c_2 (1-\rho^2)\rho^2 / (n T^2 \sigma_r^2/\sigma_x^2 - \rho^2)^2 . \]

Table F-1 gives the values of this derivative for c_1=130, c_2=330 and several values of
the other parameters. An entry of zero in the table indicates that the specified
standard error cannot be attained with the associated value of n. The table shows
that once the state sample is of sufficient size to yield the desired standard error,
increasing the size of the Federal subsample will only reduce the expected value of
the Federal gain.

We now also examine the effect on the expected value of the Federal
gain that would result from varying the desired standard error. The derivative of
E(D) with respect to \(\sigma\) is

\[ (1/\sqrt{2\pi}) \exp(-\mu^2/2\sigma^2) \]

which is always positive. For a fixed n, we have dE(D)/dn' = (dE(D)/dσ)(dσ/dn'). Since
dσ/dn' is clearly negative, so is dE(D)/dn'. Thus, the expected Federal gain is a
decreasing function of the Federal sample size, for any given size of the state sample.
It follows that to maximize the expected value of the Federal gain, given the state
sample, the Federal sample should be as small as possible. Similarly, to maximize
the Federal gain, given the size of the Federal subsample, the state sample should be
as small as possible. We conclude that from the point of view of maximizing the
expected value of the Federal gain, there is no optimum choice of the sample sizes.
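
The derivative used in this argument can be checked numerically. The sketch below compares a central finite difference of E(D) in σ with the stated closed form; the test values of μ and σ are arbitrary and the function names are illustrative.

```python
import math


def expected_disallowance(mu, sigma):
    """Closed form E(D) = sigma*phi(mu/sigma) + mu*Phi(mu/sigma)."""
    phi = math.exp(-mu**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(mu / (sigma * math.sqrt(2))))
    return sigma * phi + mu * Phi


def slope_in_sigma(mu, sigma):
    """Stated derivative of E(D) with respect to sigma; positive for all inputs."""
    return math.exp(-mu**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi)


mu, sigma, h = 0.7, 1.3, 1e-6   # arbitrary test point and step size
finite_difference = (expected_disallowance(mu, sigma + h)
                     - expected_disallowance(mu, sigma - h)) / (2 * h)
print(finite_difference, slope_in_sigma(mu, sigma))
```

Because the slope is positive for every μ, a larger standard error always raises the expected disallowance, which is the step the paragraph relies on.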






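TECHNICAL NOTE FOR APPENDIX F

Theorem: Let \(\xi\) be normally distributed with mean \(\mu\) and variance \(\sigma^2\),
and let D be the random variable defined by

\[ D = \begin{cases} \xi, & \text{if } \xi > 0 \\ 0, & \text{if } \xi \le 0. \end{cases} \]

Then the mathematical expectation of D is

\[ ED = (\sigma/\sqrt{2\pi}) \exp(-\mu^2/2\sigma^2) + (\mu/\sqrt{2\pi}) \int_{-\mu/\sigma}^{\infty} \exp(-t^2/2)\, dt. \]

Proof:

\[ ED = \mathrm{Prob}(\xi \le 0) \times 0 + \mathrm{Prob}(\xi > 0) \times E(\xi \mid \xi > 0) \]

\[ = (1/\sqrt{2\pi}\,\sigma) \int_{0}^{\infty} x \exp\{-(x-\mu)^2/2\sigma^2\}\, dx. \]

Under the transformation \(t = (x-\mu)/\sigma\) we get

\[ ED = (1/\sqrt{2\pi}) \int_{-\mu/\sigma}^{\infty} (\sigma t + \mu) \exp(-t^2/2)\, dt \]

\[ = (\sigma/\sqrt{2\pi}) \int_{-\mu/\sigma}^{\infty} t \exp(-t^2/2)\, dt + (\mu/\sqrt{2\pi}) \int_{-\mu/\sigma}^{\infty} \exp(-t^2/2)\, dt \]

\[ = (\sigma/\sqrt{2\pi}) \exp(-\mu^2/2\sigma^2) + (\mu/\sqrt{2\pi}) \int_{-\mu/\sigma}^{\infty} \exp(-t^2/2)\, dt \]

which was to be proved.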



Table F-1. Slope of the Federal gain function of the state sample size

rho = 0.85     T-bar = 300     S(x) = 60

State sample     Standard error of the estimated payment error rate
size             0.005      0.01      0.015      0.02      0.025



100 











-129.7616 


-129.9212 


200 








-129.8336 


-129.9402 


-129.9723 


300 








-129.9314 


-129.9709 


-129.9833 


400





-129.7616 


-129.9367 


-129.9798 


-129.988 


500 





-129.8746 


-129.9683 


-129.9845 


-129.9907 


600 





-129.9149


-129.9751 


-129.9873 


-129.9924 


700 





-129.9356 


-129.9794 


-129.9895 


-129 9935 


800 





-129.9462 


-129.9825 


-129.9909 


-129.9944 


900 





-129.9567 


-129.9848 


-129.992 


-129 995 


1000 





-129.9628 


-129.9863 


-129.9929 


-129.9956 


1100 





-129.9674 


-129.9879 


-129.9936 


-129 996 


1200 





-129.9709 


-129.989 


-129.9941 


-129.9963 


1300 





-129.9738 


-129.99 


-129.9946 


-129.9966 


1400 





-129.9762 


-129.9907 


-129.995 


-129.9969 


1500 





-129.9781 


-129.9914 


-129.9954 


-129.9971 


1600 


-129.7616 


-129.9798 


-129.992 


-129.9957 


-129.9973 


1700 


-129.8054 


-129.9812 


-129.9925 


-129.9959 


-129.9974 


1800 


-129.8356 


-129.9825 


-129.993 


-129.9962 


-129.9976 


1900 


-129.8577 


-129.9836 


-129.9934 


-129.9964 


-129.9977 


2000 


-129.8746 


-129.9843 


-129.9937 


-129.9966 


-129.9978 


2100 


-129.8879 


-129.9854 


-129.994 


-129.9967 


-129.9979 


2200 


-129.8986 


-129.9862 


-129.9943 


-129.9969 


-129.998 


2300 


-129.9075 


-129.9868 


-129.9946 


-129.997 


-129.9981 


2400 


-129.9149 


-129.9875 


-129.9948 


-129.9972 


-129.9982 


2500 


-129.9212 


-129.968 


-129.995 


-129.9973 


-129.9983 


2600 


-129.9267 


-129.9883 


-129.9952 


-129.9974 


-129.9983 


2700 


-129.9314 


-129.989 


-129.9954 


-129.9975 


-129.9984 


2800


-129.9356 


-129.9895 


-129.9956 


-129.9976 


-129.9985 


2900 


-129.9393 


-129.9899 


-129.9958 


-129.9977 


-129.9985 


3000 


-129.9426 


-129.9902 


-129.9959


-129.9977 


-129.9986 



Table F-1. Slope of the Federal gain function of the state sample size (continued)

rho = 0.85     T-bar = 300     S(x) = 40

State sample     Standard error of the estimated payment error rate
size             0.005      0.01      0.015      0.02      0.025



100 








-129 8782 


-129 9567 


-129.9763 


200 





-129.8336 


-129.9634 


-129 9823 


-129.9093 


300 





-129.9314 


-129.9785 


-129.989 


-129.9933 


400 





-129.9567 


-129.9848 


-129.992 


-129.993 


500 





-129.9683 


-129.9882 


-129.9937 


-129.9961 


600





-129.9731 


-129.9904 


-129.9948 


-129.9960 


700 





-129.9794 


-129.9919 


-129.9956 


-129.9972 


800


-129.8336 


-129.9825 


-129.993 


-129.9962 


-129.9976 


900 


-129.8782 


-129.9848 


-129.9938


-129.9966 


-129.9979 


1000 


-129.9032 


-129.9865 


-129.9943 


-129.997 


-129.9981 


1100 


-129.9197 


-129.9879 


-129.995 


-129.9972 


-129.9983 


1200 


-129.9314 


-129.989 


-129.9954 


-129.9975 


-129.9984 


1300 


-129.9402 


-129.99 


-129.9958 


-129.9977 


-129.9985 


1400 


-129.9469 


-129 9907 


-129.9961 


-129.9979 


-129.9986 


1500 


-129.9523 


-129.9914 


-129.9964 


-129.998 


-129.9987 


1600 


-129.9567 


-129.992 


-129.9966 


-129.9981 


-129.9988 


1700 


-129.9603 


-129.9925 


-129.9968 


-129 9982 


-129.9989 


1800


-129.9634 


-129.993 


-129.997 


-129.9983 


-129.9989 


1900 


-129.9661 


-129.9934 


-129.9972 


-129.9984 


-129 999 


2000 


-129.9683 


-129.9937 


-129.9973 


-129.9985 


-129 999 


2100 


-129.9703 


-129.994 


-129.9974 


-129.9986 


-129.9991 


2200 


-129.9721 


-129.9943 


-129.9976 


-129.9986 


-129.9991 


2300 


-129.9737 


-129.9946 


-129.9977 


-129.9987 


-129.9992 


2400 


-129.9731 


-129.9948 


-129.9978 


-129.9988 


-129.9992 


2500 


-129.9763 


-129.995 


-129.9979 


-129.9988 


-129.9992 


2600 


-129.9774 


-129.9952 


-129.9979 


-129.9989 


-129.9993 


2700 


-129.9785 


-129.9954 


-129.998 


-129.9989 


-129.9993 


2800 


-129.9794 


-129.9956 


-129.9981 


-129.9989 


-129.9993 


2900 


-129.9803 


-129 9958 


-129 9962 


-129.999 


-129.9993 


3000 


-129.9811 


-129 9959 


-129.9982 


-129.999 


-129.9994 



Table F-1. Slope of the Federal gain function of the state sample size (continued)

rho = 0.85     T-bar = 300     S(x) = 100

State sample     Standard error of the estimated payment error rate
size             0.005      0.01      0.015      0.02      0.025



100 

















200 














-129.0336 


300 











-129.8149 


-129.9314 


400 











-129.9078 


-129.9567 


500 








-129.7719 


-129.9386 


-129.9683 


600 








-129.8657 


-129.954 


-129.9751 


700 








-129.9048 


-129 9632 


-129.9794 


800 








-129.9263 


-129.9693 


-129.9825 


900 








-129.9399 


-129.9737 


-129.9848 


1000 








-129.9492 


-129.977 


-129.9865 


1100








-129.956 


-129.9796 


-129.9879 


1200 





-129.8149 


-129.9613 


-129.9816 


-129.989 


1300 





-129.8521 


-129.9654 


-129.9833 


-129.99 


1400 





-129.8769 


-129 9687 


-129.9847 


-129.9907 


1500 





-129.8946 


-129.9714 


-129.9859 


-129.9914 


1600 





-129.9078 


-129.9737 


-129.9869 


-129.992 


1700 





-129.9181 


-129.9757 


-129.9877 


-129.9925 


1800 





-129.9263 


-129.9774 


-129.9885 


-129.993 


1900 





-129.933 


-129.9788 


-129.9892 


-129.9934 


2000 





-129.9386 


-129.9801 


-129.9898 


-129.9937 


2100 





-129.9433 


-129.9813 


-129.9903 


-129.994 


2200 





-129.9474 


-129.9823 


-129.9908 


-129.9943 


2300 





-129.9509 


-129.9832 


-129.9912 


-129.9946 


2400 





-129.954 


-129.984 


-129.9916 


-129.9948 


2500 





-129.9567 


-129.9848 


-129.992 


-129.995 


2600 





-129.9591 


-129.9834 


-129.9923 


-129.9932 


2700 





-129.9613 


-129.9861 


-129.9926 


-129.9954 


2800 





-129.9632 


-129.9866 


-129.9929 


-129.9956 


2900 





-129.9649 


-129.9672 


-129.9932 


-129.9958 


3000 





-129.9665 


-129.9876 


-129.9934 


-129.9959 



Table F-1. Slope of the Federal gain function of the state sample size (continued)

rho = 0.9     T-bar = 300     S(x) = 60

State sample     Standard error of the estimated payment error rate
size             0.005      0.01      0.015      0.02      0.025



100











-129.7327 


-129.9325 


200 








-129.0388 


-129.9573 


-129.9701 


300 








-129.9421 


-129.9768 


-129.9869 


400 





-129.7327 


-129.9647 


-129.9841 


-129.9907 


500 





-129.8846 


-129.9746 


-129.9879 


-129.9927 


600 





-129.9264 


-129.9802 


-129.9902 


-129.9941 


700 





-129.946 


-129.9838 


-129.9918 


-129.995 


800





-129.9573


-129.9862 


-129.9929 


-129.9957 


900 





-129.9647 


-129.9881 


-129.9938 


-129.9962 


1000 





-129.9699 


-129.9895 


-129.9945 


-129.9966 


1100 





-129 9738 


-129.9906 


-129.995 


-129.9969 


1200 





-129 9768 


-129.9915 


-129.9955 


-129.9972 


1300 





-129.9792 


-129.9922 


-129.9958 


-129.9974 


1400 





-129.9811 


-129.9928 


-129.9961 


-129.9976 


1500 





-129.9827 


-129.9933 


-129.9964 


-129.9978 


1600 


-129.7327 


-129.9841 


-129.9938 


-129.9967 


-129.9979 


1700 


-129.7989 


-129.9852 


-129.9942 


-129.9969 


-129.998 


1800 


-129.8388 


-129.9862 


-129.9943 


-129.997 


-129.9981 


1900 


-129.8655 


-129.9871 


-129.9949 


-129.9972 


-129 9982 


2000 


-129.0846 


-129.9879 


-129.9931 


-129.9974 


-129.9983 


2100 


-129.8989 


-129.988ft 


-129.9954 


-129.9975 


-129.9984 


2200 


-129.9101 


-129.9892 


-129.9936 


-129.9976 


-129.9985 


2300 


-129.9191 


-129.9897 


-129.9950 


-129.9977 


-129.9986 


2400 


-129.9264 


-129.9902 


-129.996 


-129.9970 


-129.9986 


2500 


-129.9325 


-129.9907 


-129.9962 


-129.9979 


-129.9967 


2600 


-129.9377 


-129.9911 


-129.9963 


-129.998 


-129.9987 


2700 


-129.9421 


-129.9915


-129.9965 


-129.9981 


-129.9988 


2800 


-129.946 


-129.9918 


-129.9966 


-129.9981 


-129.9988 


2900 


-129.9493 


-129.9921 


-129.9967 


-129.9982 


-129.9989 


3000 


-129.9523 


-129.9924 


-129.9968 


-129.9983 


-129.9989 



Table F-1. Slope of the Federal gain function of the state sample size (continued)

rho = 0.9     T-bar = 300     S(x) = 40

State sample     Standard error of the estimated payment error rate
size             0.005      0.01      0.015      0.02



100

200 

300 

400 

500 

600 

700 

800 

900 

1000 

1100 

1200 

1300 

1400 

1500 

1600 

1700 

1800 

1900 

2000 

2100 

2200 

2300 

2400 

2500 

2600 

2700 

2000 

2900 

3000 

















-129.8388 

■129.8885 

■129.9148 

■129.9311 

-129.9421 

-129.9501 

■129.9362 

-129.9609 

-129.9647 

-129.9679 

-129.9703 

■129.9727 

-129.9746 

-129.9763 

-129.9778 

-129.9791 

-129.9802 

-129.9012 

-129.9022 

-129.963 

-129.9038 

-129.9045 

-129.9051 




-129.8388 
-129.9421 
-129.9647 
-129 9746 
-129.9802 
-129.9838 
-129.9862 
-129.9881 
-129.9895 
-129.9906 
-129.9913 
-129.9922 
-129.9928 
-129.9933 
-129.9938 
-129.9942 
-129.9943 
-129.9949 
-129.9931 
-129.9954 
-129.9936 
-129.9958 
-129.996 
-129.9962 
-129.9963 
-129.9965 
-129.9966 
-129.9967 
-129.9968 



-129.8885 
-129.9705 

-129 983 
-129.9881 
-129.9908 
-129.9923 
-129.9937 
-129.9943 
-129.9952 
-129.9957 
-129.9961 
-129.9965 
-129.9960 

-129.997 
-129.9972 
-129.9974 
-129.9975 
-129.9977 
-129.9970 
-129.9979 

-129.998 
-129.9901 
-129.9902 
-129.9903 
-129.9904 
-129.9904 
-129.9905 
-129.9905 
-129.9906 
-129.9906 



-129.9647 
-129.9862 
-129.9915 
-129.9938 
-129.9951 
-129.996 
-129.9966 
-129.997 
-129.9974 
-129.9977 
-129.9979 
-129.9981 
-129.9982 
-129.9983 
-129.9985 
-129.9986 
-129.9986 
-129.9987 
-129.9988 
-129.9989 
-129.9909 
-129.999 
-129.999 
-129.999 
-129.9991 
-129.9991 
-129.9992 
-129.9992 
-129.9992 
-129.9992 



0.025 

-129.9812 
-129.9918 
-129.9948 
-129.9962 
-129.997 
-129.9973 
-129.9979 
-129.9981 
-129.9984 
-129.9983 
-129.9987 
-129.9988 
-129.9989 
-129.999 
-129.999 
-129.9991 
-129.9991 
-129.9992 
-129.9992 
-129.9993 
-129.9993 
-129.9993 
-129.9994 
-129.9994 
-129.9994 
-129.9994 
-129.9995 
-129.9995 
-129.9996 
-129.9995 






Table F-1. Slope of the Federal gain function of the state sample size (continued)

rho = 0.9     T-bar = 300     S(x) = 100

Standard error of the estimated payment error rate
0.005 0.01 0.015 0.02 0.025



100 

















200 














-129.8388 


300 











-129.8119 


-129.9421 


400 











-129.9194 


-129.9647


500 








-129.7492 


-129.9487 


-129.9746 


600 








-129.8746 


-129.9624 


-129.9802 


700 








-129.9164 


-129.9703 


-129.9838 


800 








-129.9373 


-129.9753 


-129.9862 


900 








-129.9498 


-129.9791 


-129.9881 


1000 








-129.9582 


-129.9818 


-129.9895 


1100 








-129.9642 


-129.9839 


-129.9906 


1200 





-129.8119 


-129.9687 


-129.9855 


-129.9913 


1300 





-129.8589 


-129.9721 


-129.9869 


-129.9922 


1400 





-129.8871 


-129.9749 


-129.988 


-129.9928 


1500 





-129.906 


-129.9772 


-129.9889 


-129.9933 


1600 





-129.9194 


-129.9791 


-129.9897 


-129.9938 


1700 





-129.9295 


-129.9807 


-129.9904 


-129.9942 


1800 





-129.9373 


-129.9821 


-129.991 


-129.9945 


1900 





-129.9436


-129.9833 


-129.9916 


-129.9949 


2000 





-129.9487 


-129.9843 


-129.9921 


-129.9931 


2100 





-129.953 


-129.985? 


-129.9925 


-129.9954 


2200 





-129.9366 


-129.9861 


-129.9929 


-129.9956 


2300 





-129.9597 


-129.9868 


-129.9932 


-129.9958 


2400 





-129.9624 


-129.9875 


-129.9935 


-129.996 


2500 





-129.9647 


-129.9881 


-129.9938 


-129.9962 


2600 





-129.9668 


-129.9886 


-129.9941 


-129.9963 


2700 





-129.9687 


-129.9891 


-129.9943 


-129.9965 


2800





-129.9703 


-129.9896 


-129.9945 


-129.9966 


2900 





-129.9718 


-129.99 


-129.9947 


-129.9967 


3000 





-129.9731 


-129.9904 


-129.9949 


-129.9968 






Table F-1. Slope of the Federal gain function of the state sample size (continued) 



rho = 0.8     T-bar = 300     S(x) = 60

Standard error of the estimated payment error rate
0.005 0.01 0.015 0.02 0.025



100 











-129.7888 


-129.9176 


200 








-129.8432 


-129.9441 


-129.9694 


300 








-129.9274 


-129.9678 


-129.9812 


400 





-129.7088 


-129.9528 


-129.9774 


-129.9864 


500 





-129.8754 


-129.965 


-129.9826 


-129.9894 


600





-129.9116 


-129.9722 


-129.9858 


-129.9913 


700 





-129.9315 


-129.9769 


-129.988 


-129.9926 


800 





-129.9441 


-129.9803 


-129.9897 


-129.9936 


900 





-129.9528 


-129.9828 


-129.9909 


-129.9943 


1000 





-129.9591 


-129.9847 


-129.9919 


-129.9949 


1100 





-129.964 


-129.9863 


-129.9927 


-129.9954 


1200 





-129.9678 


-129.9876 


-129.9933 


-129.9958 


1300 





-129.9709 


-129.9886 


-129.9938 


-129.9961 


1400 





-129.9734 


-129.9895 


-129.9943 


-129.9964 


1500 





-129.9756 


-129.9902 


-129.9947 


-129.9967 


1600 


-129.781KJ 


-129.9774 


-129.9909 


-129.9951 


-129.9969 


1700 


-129.82 


-129.9789 


-129.9915 


-129.9954 


-129.9971 


1800 


-129.8432 


-129.9803 


-129.992 


-129.9956 


-129.9972 


1900 


-129.8611 


-129.9815 


-129.9924 


-129.9959 


-129.9974 


2000 


-129.8734 


-129.9826 


-129.9928 


-129.9961 


-129.9973 


2100 


-129.8869 


-129.9835 


-129.9932 


-129.9963 


-129.9976 


2200 


-129.8966 


-129.9844 


-129.9935 


-129.9964 


-129.9977 


2300 


-129.9047 


-129.9851 


-129.9938 


-129.9966 


-129.9978 


2400 


-129.9116 


-129.9858 


-129.9941 


-129.9967 


-129.9979 


2500 


-129.9176 


-129.9864 


-129.9943 


-129.9969 


-129.998 


2600 


-129.9228 


-129.987 


-129.9946 


-129.997 


-129.9981 


2700 


-129.9274 


-129.9676 


-129.9948 


-129.9971 


-129.9982 


2800


-129.9315 


-129.988 


-129.995 


-129.9972 


-129.9982 


2900 


-129.9352 


-129.9885 


-129.9951 


-129.9973 


-129.9983 


3000 


-129.9384 


-129.9889 


-129.9953 


-129.9974 


-129.9964 






Table F-1. Slope of the Federal gain function of the state sample size (continued) 



rho = 0.8     T-bar = 300     S(x) = 40

Standard error of the estimated payment error rate
0.005 0.01 0.015 0.02 0.025



100 








-129.8785 


-129.9528 


-129.9736 


200 





-129.0432 


-129.9598 


-129.9803 


-129.9881 


300 





-129.9274 


-129.9759 


-129.9876 


-129.9923 


400 





-129.9528 


-129.9028 


-129.9909 


-129.9943 


500 





-129.965 


-129.9066 


-129.9928 


-129.9955 


600 





-129.9722 


-129.9891 


-129.9941


-129.9963 


700 





-129.9769 


-129.9907 


-129.995 


-129.9968 


800 


-129.8432 


-129.9003 


-129.992 


-129.9956 


-129.9972 


900 


-129.8785 


-129.9828 


-129.9929 


-129.9961 


-129.9975 


1000 


-129.9008 


-129.9847 


-129.9937 


-129.9965 


-129.9978 


1100


-129.9162 


-129.9863 


-129.9943 


-129.9968 


-129.998 


1200 


-129.9274 


-129.9076 


-129.9940 


-129.9971 


-129.9982 


1300 


-129.936 


-129.9086 


-129.9952 


-129.9973 


-129.9903 


1400 


-129.9428 


-129.9893 


-129.9955 


-129.9975 


-129.9904 


1500 


-129.9483 


-129.9902 


-129.9959 


-129.9977 


-129.9905 


1600 


-129.9520 


-129.9909 


-129.9961 


-129.9970 


-129.9906 


1700 


-129.9566 


-129.9915 


-129.9964 


-129.998 


-129.9907 


1800 


-129.9598 


-129.992 


-129.9966 


-129.9981 


-129.9988 


1900 


-129.9626 


-129.9924 


-129.9960 


-129.9982 


-129.9909


2000 


-129.965 


-129.9928 


-129.9969 


-129.9963 


-129.9989 


2100 


-129.9671 


-129.9932 


-129.9071 


-129.9904 


-129.999 


2200 


-129.969 


-129.9935 


-129.9972 


-129.9904 


-129.999 


2300 


-129.9707 


-129.9938 


-129.9973 


-129.9905 


-129.9991 


2400 


-129.9722 


-129.9941 


-129.9974 


-129.9906 


-129.9991 


2500 


-129.9736 


-129.9943 


-129.9975 


-129.9906 


-129.9991 


2600 


-129.9740 


-129.9946 


-129.9976 


-129.9907 


-129.9992 


2700 


-129.9759 


-129.9940 


-129.9977 


-129.9907 


-129.9992 


2800 


-129.9769 


-129.995 


-129.9970 


-129.9900 


-129.9992 


2900 


-129.9779 


-129.9951 


-129.9979 


-129.9980 


-129.9992 


3000 


-129.9780 


-129.9953 


-129.990 


-129.9909 


-129.9993 






Table F-1. Slope of the Federal gain function of the state sample size (continued) 



rho = 0.8     T-bar = 300     S(x) = 100

Standard error of the estimated payment error rate
0.005 0.01 0.015 0.02 0.025



100 

















200 














-129.8432 


300 











-129.8272 


-129.9274 


400 











-129.905 


-129.9328 


500 








-129.7959 


-129.9345 


-129.965 


600 








-129.8678 


-129.93 


-129.9722 


700 








-129.9022 


-129.9596 


-129.9769 


800 








-129.9224 


-129.9661 


-129.9803 


900 








-129.9357 


-129.9708 


-129.9828 


1000 








-129.9451 


-129.9743 


-129.9847 


1100 








-129.9521 


-129.9771 


-129.9863 


1200 





-129.8272 


-129.9575 


-129.9793 


-129.9876 


1300 





-129.8565 


-129.9618 


-129.9812 


-129.9886 


1400 





-129.8774 


-129.9654 


-129.9827 


-129.9895 


1500 





-129.8929 


-129.9683 


-129.984 


-129.9902 


1600 





-129.905 


-129.9708 


-129.9852 


-129.9909 


1700 





-129.9146 


-129.9729 


-129.9861 


-129.9915 


1800 





-129.9224 


-129.9747 


-129.987 


-129.992 


1900 





-129.9289 


-129.9763 


-129.9877 


-129.9924 


2000 





-129.9345 


-129.9777 


-129.9884 


-129.9928 


2100 





-129.9392 


-129.979 


-129.989 


-129.9932 


2200 





-129.9433 


-129.9801 


-129.9896 


-129.9935 


2300 





-129.9468 


-129.9811 


-129.99 


-129.9938 


2400 





-129.95 


-129.982 


-129.9903 


-129.9941 


2500 





-129.9528 


-129.9828 


-129.9909 


-129.9943 


2600 





-129.9333 


-129.9836 


-129.9913 


-129.9946 


2700 





-129.9575 


-129.9843 


-129.9916 


-129.9948 


2800 





-129.9396 


-129.9849 


-129.9919 


-129.993 


2900 





-129.9614 


-129.9855 


-129.9922 


-129.9951 


3000 





-129.9631 


-129.986 


-129.9923 


-129.9953 






APPENDIX G 

OPTIMUM SAMPLE SIZE FOR DISALLOWANCES 
BASED ON LOWER CONFIDENCE BOUNDS



In this appendix, we suppose that a portion of the Federal contribution 
is withheld when the lower bound of the nominal (two-sided) 90 percent (or 
95 percent) confidence interval for the payment error rate exceeds .03, and that then 
the disallowance is the fraction of the Federal contribution equal to the excess of the 
lower bound over the tolerance level .03. We use the same notation as in 
Appendix F and we also denote 

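\(s_r\) = the estimated standard error of \(r\)

and

\[ \ell = r - 1.645\, s_r . \]

The disallowance D is then given by

\[ D = \begin{cases} (\ell - .03)\,U & \text{if } \ell > .03, \\ 0 & \text{otherwise.} \end{cases} \]

For a sample that is sufficiently large, \(\ell\) is approximately normally
distributed, with mean

\[ \mu_\ell = R - 1.645\,\sigma_r \]

and variance

\[ \sigma_\ell^2 = \sigma_r^2 + (1.645)^2 \sigma_s^2 - 2 \times 1.645\, \rho_{rs}\, \sigma_r \sigma_s . \]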

From the theorem proved in Appendix F, the expected value of D is given by 



\[ \sqrt{2\pi}\, E(D) = \sigma_\ell \exp\!\left(-\mu_\ell^2 / 2\sigma_\ell^2\right) + (\mu_\ell - .03) \int_{-\mu_\ell/\sigma_\ell}^{\infty} \exp(-t^2/2)\, dt . \]

As in Appendix F, the expected value of the gain to the Federal government is

\[ G = E(D) - c_1 n - c_2 n' , \]

but the value of E(D) is different than in the context of Appendix F. 
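The quantities above can be sketched numerically. The following is a minimal illustration, not the authors' program: it uses the standard closed form for the expected positive part of a normal variable, applied to the excess of the lower bound over the tolerance, and it assumes that U is the Federal contribution. The function and argument names are hypothetical, and the default unit costs of $130 and $330 are the values assumed in this appendix.

```python
import math


def lower_bound_moments(R, sigma_r, sigma_s, rho, k=1.645):
    """Approximate mean and standard deviation of l = r - k * s_r."""
    mean = R - k * sigma_r
    variance = sigma_r ** 2 + (k * sigma_s) ** 2 - 2.0 * k * rho * sigma_r * sigma_s
    return mean, math.sqrt(variance)


def expected_disallowance(mu_l, sigma_l, contribution, tolerance=0.03):
    """E[D] where D = (l - tolerance) * contribution if l > tolerance, else 0,
    and l is normal with mean mu_l and standard deviation sigma_l."""
    shift = mu_l - tolerance
    if sigma_l <= 0.0:
        # Degenerate case: no sampling variation in the lower bound.
        return max(shift, 0.0) * contribution
    z = shift / sigma_l
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cumulative = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return contribution * (sigma_l * density + shift * cumulative)


def expected_gain(e_d, n_state, n_federal, c_state=130.0, c_federal=330.0):
    """G = E(D) minus the cost of the state sample and of the Federal subsample."""
    return e_d - c_state * n_state - c_federal * n_federal
```

When the mean of the lower bound sits exactly at the tolerance, the expected disallowance reduces to the contribution times the standard deviation divided by the square root of 2 pi, which is a convenient check on the implementation.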

We now ask whether there are sample sizes n and n' which maximize 
the expected value G of the Federal gain. As before, 

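\[ \partial E(D) / \partial \sigma_\ell > 0 \]

and

\[ \partial G / \partial n = \left( \partial E(D) / \partial \sigma_\ell \right) \left( \partial \sigma_\ell / \partial n \right) - c_1 . \]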



But 



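\[ \partial \sigma_\ell / \partial n = (1/2\sigma_\ell)\, \partial \sigma_\ell^2 / \partial n \]

\[ = (1/2\sigma_\ell) \left[ \partial \sigma_r^2 / \partial n + 2.706\, \partial \sigma_s^2 / \partial n - 3.29\, \rho_{rs}\, \partial (\sigma_r \sigma_s) / \partial n \right] , \]

since \(\rho_{rs}\) is insensitive to variation in n. Now, since

\[ \partial \sigma_r / \partial n = (1/2\sigma_r)(\partial \sigma_r^2 / \partial n) \]

and

\[ \partial \sigma_s / \partial n = (1/2\sigma_s)(\partial \sigma_s^2 / \partial n) , \]

we have

\[ \partial \sigma_\ell / \partial n = (1/2\sigma_\ell) \left[ \partial \sigma_r^2 / \partial n + 2.706\, \partial \sigma_s^2 / \partial n - 1.645\, \rho_{rs} \left\{ (\sigma_s / \sigma_r)(\partial \sigma_r^2 / \partial n) + (\sigma_r / \sigma_s)(\partial \sigma_s^2 / \partial n) \right\} \right] \]

\[ = (1/2\sigma_\ell) \left[ (1 - 1.645\, \rho_{rs}\, \sigma_s / \sigma_r)\, \partial \sigma_r^2 / \partial n + (2.706 - 1.645\, \rho_{rs}\, \sigma_r / \sigma_s)\, \partial \sigma_s^2 / \partial n \right] . \]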

This expression is difficult to evaluate analytically. It may be positive for some 
values of n and negative for others. We are able, however, to calculate E(D) and 
therefore E(G) for given values of n and n'. We have calculated the expected 
Federal gain for three values of the annual Federal dollar amount of contribution 
(20, 50, and 300 million dollars), for four levels of the population payment error rate 
R (.04, .05, .06, and .07), and for three levels of the unit standard deviation of the
overpayment error \(\sigma_x\) (30, 50, and 70). These assumed values cover a reasonable
range of the observed values of the parameters. For Population A, the value of R is
.07297 and the value of \(\sigma_x\) is about 70. The unit costs assumed are

\(c_1\) = $130 = one-half of the cost of the state QC per case in 1982; and
\(c_2\) = $330 = unit cost per case of the Federal review in 1982.

The assumed values for the remaining parameters are:

n'/n = .15

\(\rho_{xy}\) = .9

These are reasonable values according to the available data for the year ending 
September 30, 1982, and for the three test populations that we constructed. 
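The cost side of the gain function follows directly from these assumed values, and a small sketch makes the orders of magnitude concrete. The function names are hypothetical; the subsampling rate of .15 and the unit costs of $130 and $330 are those stated above. At a Federal subsample of 500 the sampling cost is about $598,000, roughly 3 percent of a $20 million contribution but only about 0.2 percent of a $300 million contribution.

```python
def sampling_cost(n_federal, subsample_rate=0.15, c_state=130.0, c_federal=330.0):
    """Total sampling cost when the Federal subsample is a fixed fraction
    (subsample_rate = n'/n) of the state sample."""
    n_state = n_federal / subsample_rate
    return c_state * n_state + c_federal * n_federal


def cost_share(n_federal, contribution, **kwargs):
    """Sampling cost expressed as a proportion of the Federal contribution."""
    return sampling_cost(n_federal, **kwargs) / contribution


# Cost share at n' = 500 for the three contribution levels used in this appendix.
for contribution in (20e6, 50e6, 300e6):
    print(f"{contribution:>13,.0f}  {cost_share(500, contribution):.4f}")
```

Because the cost share shrinks as the contribution grows, the expected gain can be negative for small contributions even when the same sample size pays for itself for large ones.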






For the above values of the parameters and for Federal subsample sizes
up to n' = 500, Figures G-1 through G-3 show the expected Federal gain as a
proportion of the Federal contribution. The portions of the curves for extremely 
small sample sizes should be disregarded, for the approximations used in the 
mathematical development are not acceptable for such small sample sizes. 

It will be seen from Table 3-3 in Chapter 3 of this report, and from 
Figures G-1 through G-3, that when the Federal contribution is relatively large (for 
example, $300 million or more) and the payment error rate is even moderately 
higher than the target level of .03 (say .05 or more), the expected proportion of the 
Federal contribution that is withheld increases with the size of the Federal 
subsample, assuming that the subsampling rate remains constant. The proportion 
increases quite rapidly for the smaller sample sizes but at modest rates of increase for 
sample sizes greater than about 250. The proportion disallowed increases with
increasing values of \(\sigma_x\). Moreover, at any sample size the proportion disallowed is
very small if the true payment error rate is less than 5 percent.

For smaller Federal contributions, the proportion no longer increases 
monotonically with sample size. For high values of the payment error rate, e.g., 
R=.07, there is a sample size for which the proportion is maximum. However, the 
curve is quite flat in the neighborhood of the maximum, so that the proportion 
varies only a little over a broad range of sample sizes. If the payment error rate is 
low, say below 5 percent, the Federal gain may well be negative, and increasingly
negative as sample size increases. 

In general, then, from the point of view of maximizing the Federal 
gain from disallowances after offsetting the costs of sampling, the optimum strategy
would be to use quite large samples if the Federal contribution is large and the true 
payment error rate is relatively high, but to use no sample otherwise. Nevertheless, 
in the latter case samples are needed to provide assurance that the error is small, in 
addition to supplying the data needed for feedback information to improve 
administration. 

Table G-1 summarizes, by states, the approximately optimum
subsample sizes if the Federal gain from the imposition of disallowances were the
only consideration in determining sample size. The numbers in the table are
approximations using data for the last six months of fiscal year 1982, with very 
rough interpolation of the results summarized in the attached graphs. More 
accurate computations could be made for each state, but it is doubtful that it would 
be worth the effort. These results indicate that from this point of view, either no 
sample would be needed (e.g., if the state's error rate is less than 4 percent or the 
Federal contribution is quite small), or sample sizes substantially larger than those 
now used would be desirable. In some cases, no sample at all is called for, because 
the Federal contribution is so small that the potential return from disallowances 
cannot pay the cost of a sample. In other cases, no sample is called for because the 
estimated payment error rate (which was assumed here to be the true rate) was near
or below 3 percent. Of course, the "optimum" sample allocation for a particular 
state could vary widely from year to year; the results in Table G-1 are only 
illustrative. 

We have also estimated the expected gain by simulation using the Test 
Population A, with 1000 replicate samples for each of three sample sizes. For that
population, the true error rate is known, namely .07297. These simulations yielded 
the results shown in Table G-2. These results are reasonably consistent with the 
more general results based on the mathematical argument. We note that in 
Table G-2, the proportion of the Federal contribution that is returned increases with 
sample size and that the proportion is not highly sensitive to the magnitude of the 
Federal contribution. 






Table G-1. Rough approximation to optimum size of the Federal subsample if the only consideration
were the net return from disallowances

                        Optimum                            Optimum
State                   sample size     State              sample size

Alabama                 200             Montana            *
Alaska                  *               Nebraska           300
Arizona                 170             Nevada             *
Arkansas                *               New Hampshire      *
California              500+            New Jersey         500+
Colorado                300             New Mexico         200
Connecticut             400             New York           500+
Delaware                300             North Carolina     *
District of Columbia    400             North Dakota       *
Florida                 350             Ohio               500+
Georgia                 350             Oklahoma           *
Hawaii                  400             Oregon             300
Idaho                   *               Pennsylvania       500+
Illinois                500+            Rhode Island       *
Indiana                 *               South Carolina     250
Iowa                    *               South Dakota       *
Kansas                  *               Tennessee          *
Kentucky                *               Texas              300
Louisiana               350             Utah               *
Maine                   *               Vermont            *
Maryland                350             Virginia           300
Massachusetts           500+            Washington         300
Michigan                500+            West Virginia      300
Minnesota               *               Wisconsin          500+
Mississippi             *               Wyoming            *
Missouri                *

Note: The asterisk (*) denotes that no sample is called for because the Federal contribution is low or the
payment error rate is low.






Table G-2. Expected net gain from disallowances, based on simulations from Population A

Federal                     Expected        Proportion
contribution        n'      gain            returned

$720,000,000        180     $17,457,000     .024
                     80      12,089,000     .017
                     50       8,695,000     .012

 360,000,000        180       8,621,000     .024
                     80       5,998,000     .017
                     50       4,319,000     .012

 180,000,000        180       4,202,000     .023
                     80       2,953,000     .016
                     50       2,132,000     .012

  90,000,000        180       1,994,000     .022
                     80       1,431,000     .016
                     50       1,038,000     .012

  45,000,000        180         889,100     .020
                     80         669,600     .015
                     50         491,300     .011
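The "Proportion returned" column of Table G-2 is simply the expected gain divided by the Federal contribution, and the short check below confirms that the printed proportions agree with the printed gains. This is an illustrative sketch with hypothetical names; the entry for a $90,000,000 contribution and n' = 80 is read here as 1,431,000, which is an assumption because that figure is partly illegible in the scanned table.

```python
# (Federal contribution) -> list of (n', expected gain, printed proportion returned)
TABLE_G2 = {
    720_000_000: [(180, 17_457_000, 0.024), (80, 12_089_000, 0.017), (50, 8_695_000, 0.012)],
    360_000_000: [(180, 8_621_000, 0.024), (80, 5_998_000, 0.017), (50, 4_319_000, 0.012)],
    180_000_000: [(180, 4_202_000, 0.023), (80, 2_953_000, 0.016), (50, 2_132_000, 0.012)],
    90_000_000: [(180, 1_994_000, 0.022), (80, 1_431_000, 0.016), (50, 1_038_000, 0.012)],
    45_000_000: [(180, 889_100, 0.020), (80, 669_600, 0.015), (50, 491_300, 0.011)],
}


def proportion_returned(expected_gain, contribution):
    """Expected gain as a proportion of the contribution, to three decimals."""
    return round(expected_gain / contribution, 3)


# Collect any row whose recomputed proportion differs from the printed one.
mismatches = [
    (contribution, n_fed)
    for contribution, rows in TABLE_G2.items()
    for n_fed, gain, printed in rows
    if abs(proportion_returned(gain, contribution) - printed) > 1e-9
]
print(mismatches)
```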






Figure G-1. Federal gain as proportion of Federal payment share of $20,000,000

[Graph: one panel for each value of \(\sigma_x\) (30, 50, and 70). Horizontal axis: Federal sample size, 50 to 500. Vertical axis: Federal gain as a proportion of the Federal contribution.]



Figure G-2. Federal gain as proportion of Federal payment share of $50,000,000

[Graph: three panels. Horizontal axis: Federal sample size, 50 to 500. Vertical axis: Federal gain as a proportion of the Federal contribution.]



Figure G-3. Federal gain as proportion of Federal payment share of $300,000,000

[Graph: one panel for each value of \(\sigma_x\), each with curves for R = .04, .05, .06, and .07. Horizontal axis: Federal sample size, 50 to 500. Vertical axis: Federal gain as a proportion of the Federal contribution, from -.02 to .12.]




APPENDIX H 

RULE D FOR COMPUTING DISALLOWANCES 
BASED ON ACCUMULATIONS ACROSS YEARS 



As discussed in Section 3.6, disallowances are computed and assessed 
annually, and are subject to relatively large sampling errors, even with the larger
annual samples in use in the QC program in some states. These large sampling
errors can lead to substantially overstated and understated disallowances. The
problem of large overestimates of disallowances in some years would be avoided by
use of the lower confidence bound instead of the point estimate. However, with
present annual sample sizes, this use would result in large losses to the Federal
government by consistently and substantially understating the disallowances that
would be assessed if the true payment error rates were known. 

A related problem with the current rule for the assessment of 
disallowances is that disallowances are assessed annually and only when the 
estimated error rate is above the target rate. Thus, because of sampling variation, a 
state may be assessed a disallowance when in fact the true payment error rate is 
equal to or below the target rate. Moreover, since negative disallowances are not 
permitted, such disallowances would not be compensated for over time.
Consequently, a state whose true error rate is moderately above the target rate 
would, on the average, be assessed a larger disallowance than it would be if the true 
overpayment error rate were known. Also, a state whose error rate is at or below 
but near the target rate would, on the average, be assessed disallowances. 

To eliminate or substantially reduce these problems we describe a 
procedure, referred to as Rule D, that accumulates the disallowances across years. 
This procedure has the effect (assuming approximately equal sample sizes each year) 
of doubling the sample size in two years, tripling it in three, etc., and thus over a few 
years greatly reduces the impact of sampling errors. A final settlement of the
accumulated disallowances based on the point estimates is made at a time when the
sampling errors are acceptably small. In the intervening years, cash settlements are
assessed on the basis of the lower confidence bound of the accumulated 
disallowances. The Federal government recovers somewhat less in cash prior to the 
final settlement date but avoids greatly overassessing some states each year. The 
procedure also substantially eliminates overassessment of states with error rates 
near the tolerance. 

On a relative basis, the accumulated disallowance based on the lower 
confidence bound approaches over time the full disallowances based on the point 
estimates. Thus, while there may be a substantial reduction in the first year and a 
moderate reduction for a few years in the cash withholding by the Federal
government, these cash losses may be deemed acceptable in order to avoid greatly
overassessing some states in individual years. Indeed, such a procedure might 
reduce the controversy now taking place with the states over disallowances, and in 
fact, might result in substantially greater cash collections than can be obtained by 
assessing annual disallowances based on point estimates (the present procedure), 
which leads to assessments but not to cash collections except perhaps with long 
delay.

We have developed 16 examples to illustrate the disallowances 
computed by Rule D under the differing circumstances illustrated by the examples,
and to compare them with disallowances as currently computed (Rule A). Each
example is based on specific assumptions for the true error rate and other relevant
parameters. For each example, we have computed and displayed the amounts of
disallowances that would be assessed over a period of 20 years under the present 
procedure for computing disallowances, and also for Rule D. The results of these 
computations appear in Tables H-1 through H-16. 

While the accumulations are carried out for 20 years in the illustrative 
examples, the accumulations could be cut off as soon as the estimated coefficient of 
variation of the accumulated disallowance is sufficiently small, say 10 or 15 percent.
A settlement could then be made and the accumulation process could begin again.
The estimated coefficient of variation of the total accumulated disallowance each
year (based on the point estimates) is shown in the last column of the tables. The 
cut-off time would be extended more or less indefinitely for states with overpayment
error rates near the target (again by cutting off only if the estimated coefficient
of variation of the accumulated disallowance is less than 10 percent or 15 percent). 
Various minor modifications of this general approach could also be considered. 

Rule D is defined more exactly and the illustrative tables are explained 
more fully in what follows. 

Let 

\(A_i\) = Federal contribution to cost in year i;

\(\hat{R}_i\) = Estimated overpayment error rate in year i;

\(s_i\) = Estimated standard error of \(\hat{R}_i\); and

\(R_{0i}\) = Target error rate for year i.

Rule D specifies the cumulative disallowance for year i on the basis of
the successive point estimates, \(\hat{R}_i\), of the annual error rates, namely

\[ \mathcal{D}_i = \mathcal{D}_{i-1} + (\hat{R}_i - R_{0i}) A_i . \]

The cumulative cash transfer for year i is then based on the lower bound of the 
confidence interval for the cumulative disallowance: 



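\[ \mathcal{C}_i = \begin{cases} \mathcal{D}_i - t\, \sigma(\mathcal{D}_i) & \text{if positive,} \\ 0 & \text{otherwise,} \end{cases} \]

where we define \(\sigma(\mathcal{D}_i)\) as the estimated standard error of \(\mathcal{D}_i\) and \(t\) as the normal deviate for the chosen confidence level (1.645 for the nominal 90 percent interval used in the illustrative tables).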

The cumulative book value of the disallowance is the excess of the cumulative 
disallowance over the cumulative cash transfer, and is given by 



\[ \mathcal{B}_i = \mathcal{D}_i - \mathcal{C}_i . \]

Note that these formulas also apply to year 1, with the convention that all values
are zero for year 0.

The annual cash transfer for year i is then

\[ C_i = \mathcal{C}_i - \mathcal{C}_{i-1} \]

and the annual adjustment to the book disallowance is

\[ B_i = \mathcal{B}_i - \mathcal{B}_{i-1} . \]

Note that \(C_i\) may be a negative number. A negative \(C_i\) could be
returned to the state in cash or perhaps treated as a credit against future
disallowances. The choice is, of course, a policy decision.
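The bookkeeping of Rule D can be sketched in a few lines. This is an illustration under stated assumptions rather than the program used for the tables: the function and key names are hypothetical, the variance of the cumulative disallowance is taken as the sum of the squared products of the annual contribution and the annual standard error (the estimate given later in this appendix), and the multiplier 1.645 corresponds to the nominal 90 percent interval.

```python
import math


def rule_d(contributions, est_rates, std_errors, targets, t=1.645):
    """Year-by-year Rule D quantities; all values are zero for year 0."""
    rows = []
    cum_d = 0.0      # cumulative disallowance from the point estimates
    cum_var = 0.0    # estimated variance of the cumulative disallowance
    prev_cash = 0.0  # cumulative cash transfer through the previous year
    prev_book = 0.0  # cumulative book value through the previous year
    for A, R, s, R0 in zip(contributions, est_rates, std_errors, targets):
        cum_d += (R - R0) * A
        cum_var += (A * s) ** 2
        sd = math.sqrt(cum_var)
        cum_cash = max(cum_d - t * sd, 0.0)  # lower confidence bound, floored at zero
        cum_book = cum_d - cum_cash
        rows.append({
            "cum_disallowance": cum_d,
            "cum_cash": cum_cash,
            "cum_book": cum_book,
            "annual_cash": cum_cash - prev_cash,
            "annual_book": cum_book - prev_book,
            "cv": sd / cum_d if cum_d > 0 else math.inf,
        })
        prev_cash, prev_book = cum_cash, cum_book
    return rows
```

As a check, feeding in what appear to be the first two years of Table H-1 (annual contribution 1,000, estimated error rates .085 and .0762, standard errors .00634 and .00528, target .03; the scan is imperfect, so these readings are assumptions) gives annual cash amounts that round to 45 and 43 and annual book amounts that round to 10 and 3.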

The computation given above for the cumulative disallowance is
algebraically equivalent to applying the difference between the weighted averages of
\(\hat{R}_j\) and \(R_{0j}\) to the total Federal contribution up to and including the current year.
The weights are the proportions that the annual Federal contributions constitute of
the total Federal contributions. To show this, we write

\[ \mathcal{D}_i = \mathcal{D}_{i-2} + (\hat{R}_{i-1} - R_{0,i-1}) A_{i-1} + (\hat{R}_i - R_{0i}) A_i \]

\[ = \left[ \sum_{j=1}^{i} \frac{A_j}{\sum_{k=1}^{i} A_k}\, \hat{R}_j - \sum_{j=1}^{i} \frac{A_j}{\sum_{k=1}^{i} A_k}\, R_{0j} \right] \sum_{j=1}^{i} A_j . \]




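Since the samples are independent from year to year, it follows that the
variance of \(\mathcal{D}_i\) may be estimated by

\[ \mathrm{var}(\mathcal{D}_i) = \sum_{j=1}^{i} A_j^2 s_j^2 . \]

The coefficient of variation is therefore estimated by

\[ \mathrm{cv}(\mathcal{D}_i) = \frac{[\mathrm{var}(\mathcal{D}_i)]^{1/2}}{\mathcal{D}_i} = \frac{1}{\mathcal{D}_i} \left[ \sum_{j=1}^{i} A_j^2 s_j^2 \right]^{1/2} . \]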

Description of Tables 

The 16 examples presented in Tables H-1 through H-16 assume various 
true overpayment error rates and two levels of sampling error. The assumed 
parameters are shown at the bottom of each table. The examples show a 20-year 
history of estimated payment error rates. For Examples 1-12, the true payment error 
rate is assumed to be constant over the years. For Examples 13-16, the true payment 
error rates vary over the years, as displayed in the column headed "True error 
rate." 

The second and third columns, headed "Error rate" and "sigma,"
represent the observed estimates of the overpayment error rate and its standard
error. They are derived by random selection from the joint distributions of \(\hat{R}\) and
\(s_r\) defined by the parameters shown for the example. The simulation of the
estimated error rate assumed a normal distribution of the estimated error rate, with
the specified standard deviation. The latter corresponds approximately to the
Federal sample size shown, and is roughly consistent with values observed in the
QC program. The standard error of the estimated payment error rate ("sigma") was
simulated by assuming that it was normally distributed with mean equal to the true
standard deviation and variance given by the quantity \(\sigma^2 (\beta - 1)/4n'\), with \(\beta\) set
equal to 40. This gives variances of \(s_r\) that roughly correspond to variances of
estimated standard deviations observed for Test Populations A and B. The
simulation also involves the assumption that the correlation "rho" between the
estimated error rate and its estimated standard error is .7. This also corresponds
roughly to the AFDC experience (as seen in Table C-1 in Appendix C).
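One way to generate such a pair of correlated normal draws is sketched below. It is an illustration only, not the procedure actually used for the tables: the function name and the example parameter values (a true rate of .08, a true standard deviation of .006, and a Federal sample of 300) are hypothetical, while the value 40 for beta and the correlation of .7 are those stated above.

```python
import math
import random


def simulate_estimates(true_rate, true_sd, n_federal, beta=40.0, rho=0.7, rng=random):
    """Draw one (estimated error rate, estimated standard error) pair.

    Both are treated as normal; the standard error has mean true_sd and
    variance true_sd**2 * (beta - 1) / (4 * n_federal), and the two draws
    have correlation rho.
    """
    sd_of_s = true_sd * math.sqrt((beta - 1.0) / (4.0 * n_federal))
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    r_hat = true_rate + true_sd * z1
    # Mixing z1 and z2 in these proportions gives correlation rho with r_hat.
    s_hat = true_sd + sd_of_s * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return r_hat, s_hat


# Example: a 20-year history for one hypothetical state.
rng = random.Random(1987)
history = [simulate_estimates(0.08, 0.006, 300, rng=rng) for _ in range(20)]
```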

The column headed "AFDC" shows the disallowance that would be 
assessed by the present AFDC procedure (except that the negative disallowances 
shown in this column would be zeros under the present procedure). The two 
columns headed "Current Disallowance" show the amounts in the current year, 
added to or subtracted from the cumulative amounts for the previous year, as 
described above. Thus, the "Cash" column shows the amount that would be 
withheld (or perhaps disbursed or credited, if negative) in the specified year, and the 
"Book" column shows the change for the current year in the amount of the credit 
on the books. Note that the sum of the cash and book amounts is equal to the 
figures in the AFDC column, except for rounding errors. 

The remaining columns show cumulated values. The error rate 
shown is the average estimated error rate, up to and including the current year.^ 
The accumulated standard error ("sigma") is computed on the basis of each year 
providing an independent sample; i.e., the variance for a given year is computed on 
the basis of the fact that the annual samples are independent of one another and
assuming that the square of the estimated standard error in each year is an unbiased 
estimate of the variance of the estimated payment error rate. The "Lower bound" 
for a given year is computed as the estimated error rate minus 1.645 times the 
estimated standard error for the cumulative (average) error rate, and thus is the 
lower bound of the nominal 90 percent symmetric confidence interval. Upper
confidence bounds are computed in a similar manner, although they play no role in
Rule D. The cash and book accumulated disallowances are computed as described
above. The column "Desired Disallowance" shows the accumulated disallowances
that would be assessed under present procedures if the true error rates were known
and used to assess the accumulated disallowance. Consequently, no credit is given
in years in which the true error rate is less than the target rate.

^In practice, the procedure described above for computing the cumulative disallowances by Rule D does
not involve the computation of this cumulative error rate. We noted above that, implicitly, the
effective cumulative error rate is the weighted average of the annual error rates, weighted by the
annual Federal payments. However, since the annual Federal payments are assumed to be constant in
these illustrations, no weighting is involved.

The tables illustrate how, as the overpayment error rate approaches the 
target, the estimated coefficient of variation increases, and no cash settlement is 
involved under Rule D. 
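A small numerical sketch may help show why the coefficient of variation grows near the target. It assumes that the "cv" column is the cumulative standard error divided by the absolute difference between the cumulative error rate and the target, and that a cash settlement arises only when the lower bound exceeds the target. Both rules, the 0.03 target, and the function names are assumptions made for illustration; they appear consistent with the tabulated figures but are not stated in this passage.

```python
def coefficient_of_variation(cum_rate, cum_sigma, target=0.03):
    """ASSUMED definition: standard error relative to the excess over the target."""
    excess = abs(cum_rate - target)
    return float("inf") if excess == 0 else cum_sigma / excess


def has_cash_settlement(cum_rate, cum_sigma, target=0.03, z=1.645):
    """ASSUMED rule: cash is involved only if the lower bound is above the target."""
    return cum_rate - z * cum_sigma > target


# Same standard error, error rates moving toward the 0.03 target.
for rate in (0.0850, 0.0600, 0.0400, 0.0350, 0.0301):
    cv = coefficient_of_variation(rate, 0.0054)
    flag = "***" if cv >= 10 else f"{cv:.2f}"
    print(rate, flag, has_cash_settlement(rate, 0.0054))
```

The loop holds the standard error fixed and moves the error rate toward the target: the ratio climbs, the "***" flag appears once it reaches 10, and the cash test fails as soon as the lower bound drops below the target.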






Table H-1. Federal withholding, Rule D, Example 1

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.085  | 0.00634 | 55   | 45   | 10   | 1,000    | 0.0850 | 0.0063 | 0.0746 | 0.0955 | 45   | 10   | 50      | -5      | 0.12
2    | 0.0762 | 0.00528 | 46   | 43   | 3    | 2,000    | 0.0806 | 0.0041 | 0.0738 | 0.0874 | 88   | 14   | 100     | -1      | 0.06
3    | 0.0839 | 0.00857 | 54   | 48   | 6    | 3,000    | 0.0817 | 0.0040 | 0.0752 | 0.0882 | 135  | 20   | 150     | -5      | 0.08
4    | 0.0688 | 0.00451 | 39   | 37   | 1    | 4,000    | 0.0785 | 0.0032 | 0.0732 | 0.0837 | 173  | 21   | 200     | 1       | 0.07
5    | 0.0721 | 0.00337 | 42   | 41   | 1    | 5,000    | 0.0772 | 0.0026 | 0.0729 | 0.0815 | 214  | 22   | 250     | 14      | 0.06
6    | 0.0736 | 0.00443 | 44   | 43   | 1    | 6,000    | 0.0766 | 0.0023 | 0.0728 | 0.0804 | 257  | 23   | 300     | [?]     | 0.05
7    | 0.0728 | 0.00458 | 43   | 42   | 1    | 7,000    | 0.0761 | 0.0021 | 0.0726 | 0.0795 | 299  | 24   | 350     | 27      | 0.05
8    | 0.0833 | 0.00779 | 53   | 50   | 3    | 8,000    | 0.0770 | 0.0021 | 0.0736 | 0.0804 | 349  | 27   | 400     | [?]     | 0.04
9    | 0.0842 | 0.00614 | 54   | 52   | 2    | 9,000    | 0.0778 | 0.0020 | 0.0746 | 0.0810 | 401  | 29   | 450     | 21      | 0.04
10   | 0.072  | 0.0067  | 42   | 40   | 2    | 10,000   | 0.0772 | 0.0019 | 0.0741 | 0.0803 | 441  | 31   | 500     | 21      | 0.04
11   | 0.0765 | 0.00362 | 46   | 46   | 1    | 11,000   | 0.0771 | 0.0017 | 0.0743 | 0.0800 | 487  | 32   | 550     | [?]     | 0.04
12   | 0.0727 | 0.00457 | 43   | 42   | 1    | 12,000   | 0.0768 | 0.0016 | 0.0741 | 0.0795 | 529  | 33   | 600     | [?]     | 0.04
13   | 0.0893 | 0.00722 | 59   | 57   | 2    | 13,000   | 0.0777 | 0.0016 | 0.0751 | 0.0804 | 586  | 35   | 650     | 21      | 0.03
14   | 0.0865 | 0.00678 | 57   | 55   | 2    | 14,000   | 0.0784 | 0.0016 | 0.0758 | 0.0810 | 641  | 36   | 700     | 23      | 0.03
15   | 0.0831 | 0.00647 | 53   | 52   | 2    | 15,000   | 0.0787 | 0.0015 | 0.0762 | 0.0812 | 692  | 38   | 750     | [?]     | 0.03
16   | 0.0798 | 0.00584 | 50   | 49   | 1    | 16,000   | 0.0788 | 0.0015 | 0.0763 | 0.0812 | 741  | 39   | 800     | [?]     | 0.03
17   | 0.0877 | 0.00621 | 58   | 56   | 1    | 17,000   | 0.0793 | 0.0014 | 0.0769 | 0.0817 | 797  | 40   | 850     | 12      | 0.03
18   | 0.0765 | 0.00571 | 46   | 45   | 1    | 18,000   | 0.0791 | 0.0014 | 0.0768 | 0.0814 | 843  | 41   | 900     | [?]     | 0.03
19   | 0.0787 | 0.00574 | 49   | 46   | 1    | 19,000   | 0.0791 | 0.0014 | 0.0769 | 0.0813 | 890  | 43   | 950     | 17      | 0.03
20   | 0.0778 | 0.00505 | [?]  | 47   | 1    | 20,000   | 0.0790 | 0.0015 | 0.0769 | 0.0812 | 937  | 44   | 1,000   | [?]     | 0.03

Parameters:
  True payment error rate        0.08
  Standard deviation             0.006
  Beta                           40
  rho                            0.7
  Sample size, n'                360
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-2. Federal withholding, Rule D, Example 2

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.1021 | 0.02026 | 72   | 39   | 33   | 1,000    | 0.1021 | 0.0203 | 0.0667 | 0.1354 | 39   | 33   | 50      | -22     | 0.28
2    | 0.0714 | 0.0124  | 41   | 36   | 6    | 2,000    | 0.0867 | 0.0119 | 0.0672 | 0.1063 | 74   | 39   | 100     | -13     | 0.21
3    | 0.089  | 0.01207 | 59   | 54   | 5    | 3,000    | 0.0875 | 0.0089 | 0.0729 | 0.1021 | 129  | 44   | 150     | -22     | 0.15
4    | 0.0685 | 0.00685 | 38   | 37   | 1    | 4,000    | 0.0827 | 0.0069 | 0.0714 | 0.0940 | 165  | 45   | 200     | -11     | 0.13
5    | 0.0778 | 0.01181 | 48   | [?]  | 4    | 5,000    | 0.0817 | 0.0060 | 0.0719 | 0.0916 | 209  | 49   | 250     | -8      | 0.12
6    | 0.0594 | 0.01481 | 29   | 24   | 6    | 6,000    | 0.0780 | 0.0056 | 0.0688 | 0.0871 | 233  | 55   | 300     | [?]     | 0.12
7    | 0.0936 | 0.01483 | 64   | 58   | 5    | 7,000    | 0.0802 | 0.0052 | 0.0716 | 0.0888 | 291  | 60   | 350     | -2      | 0.10
8    | 0.0752 | 0.01392 | 45   | 41   | 4    | 8,000    | 0.0796 | 0.0049 | 0.0716 | 0.0876 | 332  | 64   | 400     | 3       | 0.10
9    | 0.0893 | 0.01669 | 59   | 54   | 6    | 9,000    | 0.0807 | 0.0047 | 0.0729 | 0.0884 | 386  | 70   | 450     | -1      | 0.09
10   | 0.0765 | 0.00811 | 46   | 45   | 1    | 10,000   | 0.0803 | 0.0043 | 0.0731 | 0.0874 | 431  | 71   | 500     | -3      | 0.09
11   | 0.0581 | 0.00709 | 28   | 27   | 1    | 11,000   | 0.0782 | 0.0040 | 0.0717 | 0.0848 | 458  | 72   | 550     | [?]     | 0.08
12   | 0.0722 | 0.01004 | 42   | 40   | 2    | 12,000   | 0.0777 | 0.0037 | 0.0716 | 0.0839 | 499  | 74   | 600     | 27      | 0.08
13   | 0.0748 | 0.00673 | 45   | 44   | 1    | 13,000   | 0.0775 | 0.0035 | 0.0718 | 0.0833 | 543  | 75   | 650     | 32      | 0.07
14   | 0.0818 | 0.0146  | 52   | 48   | 4    | 14,000   | 0.0778 | 0.0034 | 0.0722 | 0.0834 | 591  | 79   | 700     | 31      | 0.07
15   | 0.0856 | 0.0107  | 56   | 54   | [?]  | 15,000   | [?]    | 0.0033 | 0.0730 | 0.0837 | 644  | 81   | 750     | 25      | 0.07
16   | 0.0697 | 0.01019 | 40   | 38   | 2    | 16,000   | 0.0778 | 0.0031 | 0.0727 | 0.0829 | 682  | 82   | 800     | [?]     | 0.07
17   | 0.0857 | 0.01192 | 56   | 53   | 2    | 17,000   | 0.0783 | 0.0030 | 0.0733 | 0.0832 | 736  | 85   | 850     | 38      | 0.06
18   | 0.0771 | 0.00865 | 47   | 46   | 1    | 18,000   | 0.0782 | 0.0029 | 0.0734 | 0.0830 | 782  | 86   | 900     | 32      | 0.06
19   | 0.0644 | 0.00441 | 34   | 34   |      | 19,000   | 0.0775 | 0.0028 | 0.0729 | 0.0820 | 816  | 86   | 950     | 41      | 0.06
20   | [?]    | [?]     | [?]  | [?]  | 1    | 20,000   | [?]    | [?]    | 0.0730 | 0.0817 | 859  | 87   | 1,000   | [?]     | 0.06

Parameters:
  True payment error rate        0.08
  Standard deviation             0.012
  Beta                           40
  rho                            0.7
  Sample size, n'                120
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-3. Federal withholding, Rule D, Example 3

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0635 | 0.00734 | 33   | 21   | 12   | 1,000    | 0.0635 | 0.0073 | 0.0514 | 0.0755 | 21   | 12   | 30      | -8      | 0.22
2    | 0.0634 | 0.00652 | 33   | 29   | 4    | 2,000    | 0.0634 | 0.0049 | 0.0554 | 0.0715 | 51   | 16   | 60      | -7      | 0.15
3    | 0.0697 | 0.00668 | 40   | 36   | 3    | 3,000    | 0.0655 | 0.0040 | 0.0590 | 0.0720 | 87   | 20   | 90      | -17     | [?]
4    | 0.0705 | 0.00713 | 40   | 37   | 3    | 4,000    | 0.0668 | 0.0035 | 0.0611 | 0.0725 | 124  | 23   | 120     | -27     | 0.09
5    | 0.0541 | 0.00547 | 24   | 22   | 2    | 5,000    | 0.0642 | 0.0030 | 0.0593 | 0.0691 | 147  | 24   | 150     | -21     | 0.09
6    | 0.0658 | 0.00658 | 36   | 34   | 2    | 6,000    | 0.0645 | 0.0027 | 0.0600 | 0.0689 | 180  | 27   | 180     | [?]     | 0.08
7    | 0.0732 | 0.00775 | 43   | 40   | 3    | 7,000    | 0.0657 | 0.0026 | 0.0615 | 0.0700 | 220  | 30   | 210     | [?]     | 0.07
8    | 0.0588 | 0.00549 | 29   | 28   | 1    | 8,000    | 0.0649 | 0.0024 | 0.0610 | 0.0687 | 248  | 31   | 240     | -31     | 0.07
9    | 0.0634 | 0.00725 | 33   | 31   | 2    | 9,000    | 0.0647 | 0.0022 | 0.0610 | 0.0684 | 279  | 33   | 270     | -42     | 0.06
10   | 0.0600 | 0.00545 | 31   | 30   |      | 10,000   | 0.0643 | 0.0021 | 0.0609 | 0.0678 | 309  | 34   | 300     | -43     | 0.06
11   | 0.0543 | 0.00585 | 24   | 23   |      | 11,000   | 0.0634 | 0.0020 | 0.0602 | 0.0667 | 332  | 36   | 330     | [?]     | 0.06
12   | 0.0603 | 0.00693 | 30   | 29   | 2    | 12,000   | 0.0632 | 0.0019 | 0.0600 | 0.0663 | 360  | 36   | 360     | [?]     | 0.06
13   | 0.0638 | 0.00534 | 34   | 33   |      | 13,000   | 0.0632 | 0.0018 | 0.0602 | 0.0662 | 393  | 39   | 390     | -42     | 0.05
14   | 0.0468 | 0.00532 | 17   | 16   |      | 14,000   | 0.0620 | 0.0017 | 0.0592 | 0.0649 | 409  | 40   | 420     | -21     | 0.05
15   | 0.0572 | 0.00513 | 27   | 26   |      | 15,000   | 0.0617 | 0.0016 | 0.0590 | 0.0644 | 435  | 40   | 450     | 11      | 0.05
16   | 0.0612 | 0.00655 | 31   | 30   |      | 16,000   | 0.0617 | 0.0016 | 0.0591 | 0.0643 | 465  | 42   | 480     | [?]     | 0.05
17   | 0.0656 | 0.00788 | 36   | 34   | 2    | 17,000   | 0.0619 | 0.0016 | 0.0593 | 0.0645 | 499  | 44   | 510     | -32     | 0.05
18   | 0.0673 | 0.00565 | 37   | 36   |      | 18,000   | 0.0622 | 0.0015 | 0.0597 | 0.0647 | 535  | 45   | 540     | -40     | 0.05
19   | 0.0506 | 0.00428 | 21   | 20   |      | 19,000   | 0.0616 | 0.0014 | 0.0592 | 0.0640 | 555  | 45   | 570     | [?]     | 0.05
20   | [?]    | [?]     | [?]  | [?]  |      | 20,000   | [?]    | [?]    | 0.0590 | 0.0636 | 580  | 46   | 600     | [?]     | 0.04

Parameters:
  True payment error rate        0.06
  Standard deviation             0.006
  Beta                           40
  rho                            0.7
  Sample size, n'                360
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-4. Federal withholding, Rule D, Example 4

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0549 | 0.00911 | 25   | 10   | 15   | 1,000    | 0.0549 | 0.0091 | 0.0399 | 0.0699 | 10   | 15   | 30      | 5       | 0.37
2    | 0.0656 | 0.01142 | 36   | 27   | 9    | 2,000    | 0.0602 | 0.0073 | 0.0482 | 0.0722 | 36   | 24   | 60      |         | 0.24
3    | 0.0644 | 0.01016 | 34   | 29   | 5    | 3,000    | 0.0616 | 0.0059 | 0.0518 | 0.0714 | 66   | 29   | 90      | -5      | 0.19
4    | 0.0608 | 0.01428 | 31   | 23   | 8    | 4,000    | 0.0614 | 0.0057 | 0.0520 | 0.0708 | 88   | 38   | 120     | -1      | 0.18
5    | 0.0617 | 0.01097 | 32   | 28   | 4    | 5,000    | 0.0615 | 0.0051 | 0.0531 | 0.0696 | 116  | 42   | 150     | 1       | 0.16
6    | 0.0656 | 0.01257 | 36   | 31   | 5    | 6,000    | 0.0621 | 0.0047 | 0.0544 | 0.0699 | 146  | 47   | 180     | -18     | 0.15
7    | 0.0617 | 0.01567 | 32   | 25   | 7    | 7,000    | 0.0621 | 0.0046 | 0.0545 | 0.0697 | 171  | 53   | 210     | -15     | 0.14
8    | 0.0712 | 0.01524 | 41   | 36   | 6    | 8,000    | 0.0632 | 0.0045 | 0.0559 | 0.0706 | 207  | 59   | 240     | -28     | 0.13
9    | 0.0613 | 0.01274 | 31   | 28   | 4    | 9,000    | 0.0630 | 0.0042 | 0.0561 | 0.0699 | 235  | 62   | 270     | -27     | 0.13
10   | 0.0675 | 0.01284 | 37   | 34   | 3    | 10,000   | 0.0634 | 0.0040 | 0.0569 | 0.0700 | 269  | 66   | 300     | [?]     | 0.12
11   | 0.0718 | 0.01374 | 42   | 38   | 4    | 11,000   | 0.0642 | 0.0038 | 0.0579 | 0.0705 | 307  | 70   | 330     | [?]     | [?]
12   | 0.0639 | 0.01588 | 34   | 29   | 5    | 12,000   | 0.0642 | 0.0038 | 0.0580 | 0.0704 | 336  | 74   | 360     | -50     | [?]
13   | 0.065  | 0.01047 | 35   | 33   | 2    | 13,000   | 0.0642 | 0.0036 | 0.0584 | 0.0701 | 369  | 76   | 390     | -55     | 0.10
14   | 0.0672 | 0.01381 | 37   | 34   | 3    | 14,000   | 0.0645 | 0.0035 | 0.0588 | 0.0701 | 403  | 80   | 420     | -82     | 0.10
15   | 0.0457 | 0.00558 | 16   | 15   | 1    | 15,000   | 0.0632 | 0.0033 | 0.0579 | 0.0686 | 418  | 80   | 450     | -41     | 0.10
16   | 0.0743 | 0.01442 | 44   | 41   | 3    | 16,000   | 0.0639 | 0.0032 | 0.0587 | 0.0691 | 459  | 84   | 480     | 42      | 0.09
17   | 0.0813 | 0.01207 | 51   | 49   | 2    | 17,000   | 0.0649 | 0.0031 | 0.0599 | 0.0700 | 508  | 86   | 510     | [?]     | 0.09
18   | 0.0616 | 0.01232 | 32   | 29   | 2    | 18,000   | 0.0647 | 0.0030 | 0.0598 | 0.0696 | 537  | 88   | 540     | -85     | 0.09
19   | 0.0634 | 0.01654 | 33   | 29   | 4    | 19,000   | 0.0647 | 0.0030 | 0.0598 | 0.0695 | 566  | 92   | 570     | -81     | 0.09
20   | 0.0661 | 0.01333 | 36   | 34   | [?]  | 20,000   | 0.0647 | [?]    | 0.0600 | 0.0695 | 600  | 95   | 600     | -85     | 0.08

Parameters:
  True payment error rate        0.06
  Standard deviation             0.012
  Beta                           40
  rho                            0.7
  Sample size, n'                120
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.




























Table H-5. Federal withholding, Rule D, Example 5

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0385 | 0.00543 | 8    |      | 8    | 1,000    | 0.0385 | 0.0054 | 0.0295 | 0.0474 |      | 8    | 10      | 2       | 0.64
2    | 0.0481 | 0.00707 | 18   | 12   | 6    | 2,000    | 0.0433 | 0.0045 | 0.0359 | 0.0506 | 12   | 15   | 20      | 7       | 0.34
3    | 0.039  | 0.00523 | 9    | 7    | 2    | 3,000    | 0.0418 | 0.0034 | 0.0362 | 0.0475 | 18   | 17   | 30      | -5      | 0.29
4    | 0.0315 | 0.00364 | 1    |      | 1    | 4,000    | 0.0392 | 0.0027 | 0.0347 | 0.0438 | 19   | 18   | 40      | 3       | 0.30
5    | 0.0386 | 0.00517 | 9    | 7    | 2    | 5,000    | 0.0392 | 0.0024 | 0.0352 | 0.0431 | 26   | 20   | 50      | 4       | 0.26
6    | 0.0418 | 0.00581 | 12   | 10   | 2    | 6,000    | 0.0396 | 0.0022 | 0.0359 | 0.0433 | 36   | 22   | 60      | 2       | 0.23
7    | 0.0303 | 0.00404 |      | -1   | 1    | 7,000    | 0.0383 | 0.0020 | 0.0350 | 0.0416 | 35   | 23   | 70      | 12      | 0.24
8    | 0.0401 | 0.00604 | 10   | 8    | 2    | 8,000    | 0.0385 | 0.0019 | 0.0354 | 0.0416 | 43   | 25   | 80      | 12      | 0.22
9    | 0.0369 | 0.00491 | 7    | 6    | 1    | 9,000    | 0.0383 | 0.0018 | 0.0354 | 0.0413 | 49   | 26   | 90      | 15      | 0.21
10   | 0.035  | 0.00537 | 5    | 4    | 1    | 10,000   | 0.0380 | 0.0017 | 0.0352 | 0.0408 | 52   | 28   | 100     | 21      | 0.21
11   | 0.0352 | 0.00577 | 5    | 4    | 2    | 11,000   | 0.0377 | 0.0016 | 0.0351 | 0.0404 | 56   | 29   | 110     | 25      | 0.21
12   | 0.052  | 0.00822 | 22   | 19   | 3    | 12,000   | 0.0389 | 0.0016 | 0.0362 | 0.0416 | 75   | 32   | 120     | 13      | 0.18
13   | 0.0288 | 0.00542 | -1   | -2   | 1    | 13,000   | 0.0382 | 0.0016 | 0.0356 | 0.0407 | 72   | 34   | 130     | 24      | 0.19
14   | 0.0368 | 0.0043  | 7    | 6    | 1    | 14,000   | 0.0381 | 0.0015 | 0.0356 | 0.0405 | 78   | 34   | 140     | 27      | 0.18
15   | 0.0392 | 0.00584 | 9    | 8    | 1    | 15,000   | 0.0381 | 0.0014 | 0.0358 | 0.0405 | 86   | 36   | 150     | 21      | 0.18
16   | 0.0477 | 0.00635 | 18   | 16   | 2    | 16,000   | 0.0387 | 0.0014 | 0.0364 | 0.0410 | 102  | 37   | 160     | [?]     | 0.16
17   | 0.0418 | 0.00637 | 12   | 10   | 1    | 17,000   | 0.0389 | 0.0014 | 0.0366 | 0.0412 | 113  | 39   | 170     | 11      | 0.15
18   | 0.0442 | 0.00644 | 14   | 13   | 1    | 18,000   | 0.0392 | 0.0014 | 0.0370 | 0.0414 | 126  | 40   | 180     | [?]     | 0.15
19   | 0.0399 | 0.0063  | 10   | 9    | 1    | 19,000   | 0.0392 | 0.0013 | 0.0371 | 0.0414 | 134  | 41   | 190     | [?]     | 0.14
20   | 0.0463 | 0.00686 | 16   | 15   | 1    | 20,000   | 0.0396 | 0.0013 | 0.0374 | 0.0417 | 149  | 43   | 200     | 1       | 0.14

Parameters:
  True payment error rate        0.04
  Standard deviation             0.006
  Beta                           40
  rho                            0.7
  Sample size, n'                360
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.




























Table H-6. Federal withholding, Rule D, Example 6

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0433 | 0.01036 | 13   |      | 13   | 1,000    | 0.0433 | 0.0104 | 0.0263 | 0.0604 |      | 13   | 10      | -3      | 0.78
2    | 0.0405 | 0.00997 | 11   |      | 10   | 2,000    | 0.0419 | 0.0072 | 0.0301 | 0.0537 |      | 24   | 20      | -4      | 0.60
3    | 0.0287 | 0.00517 | -1   |      | -1   | 3,000    | 0.0375 | 0.0051 | 0.0291 | 0.0459 |      | 23   | 30      | 7       | 0.68
4    | 0.0738 | 0.0183  | 44   | 27   | 17   | 4,000    | 0.0466 | 0.0060 | 0.0368 | 0.0564 | 27   | 39   | 40      | -21     | 0.36
5    | 0.0306 | 0.00772 | 1    | -1   | 2    | 5,000    | 0.0434 | 0.0050 | 0.0351 | 0.0516 | 26   | 41   | 50      | -17     | 0.37
6    | 0.0247 | 0.00757 | -5   | -7   | 2    | 6,000    | 0.0403 | 0.0044 | 0.0331 | 0.0475 | 19   | 43   | 60      | [?]     | 0.42
7    | 0.0483 | 0.01305 | 18   | 13   | 5    | 7,000    | 0.0414 | 0.0042 | 0.0346 | 0.0483 | 32   | 48   | 70      | -10     | 0.37
8    | 0.0503 | 0.01138 | 20   | 17   | 4    | 8,000    | 0.0425 | 0.0039 | 0.0361 | 0.0490 | 49   | 52   | 80      | -28     | 0.31
9    | 0.0366 | 0.00908 | 7    | 5    | 2    | 9,000    | 0.0419 | 0.0036 | 0.0359 | 0.0479 | 53   | 54   | 90      | -17     | 0.31
10   | 0.0391 | 0.01332 | 9    | 5    | 4    | 10,000   | 0.0416 | 0.0035 | 0.0358 | 0.0474 | 58   | 56   | 100     | -11     | 0.30
11   | 0.0543 | 0.01586 | 24   | 19   | 6    | 11,000   | 0.0428 | 0.0035 | 0.0370 | 0.0485 | 77   | 64   | 110     | -38     | 0.28
12   | 0.033  | 0.01354 | 3    | -1   | 4    | 12,000   | 0.0419 | 0.0034 | 0.0363 | 0.0476 | 76   | 67   | 120     | -23     | 0.29
13   | 0.0127 | 0.00849 | -17  | -19  | 1    | 13,000   | 0.0397 | 0.0032 | 0.0344 | 0.0450 | 57   | 69   | 130     | 4       | 0.33
14   | 0.0367 | 0.01018 | 7    | 5    | 2    | 14,000   | 0.0395 | 0.0031 | 0.0344 | 0.0445 | 62   | 71   | 140     | 7       | 0.32
15   | 0.0581 | 0.01267 | 28   | 25   | 3    | 15,000   | 0.0407 | 0.0030 | 0.0350 | 0.0456 | 87   | 74   | 150     | [?]     | 0.28
16   | 0.0293 | 0.00336 | -1   | -1   |      | 16,000   | 0.0400 | 0.0028 | 0.0354 | 0.0446 | 86   | 74   | 160     | 8       | 0.28
17   | 0.0642 | 0.02001 | 34   | 27   | 7    | 17,000   | 0.0414 | 0.0029 | 0.0367 | 0.0462 | 113  | 81   | 170     | -24     | 0.25
18   | 0.0461 | 0.01662 | 16   | 12   | 4    | 18,000   | 0.0417 | 0.0029 | 0.0369 | 0.0464 | 125  | 86   | 180     | -31     | 0.25
19   | 0.0214 | 0.00754 | -9   | -10  | 1    | 19,000   | 0.0406 | 0.0028 | 0.0361 | 0.0452 | 115  | 86   | 190     | -12     | 0.26
20   | [?]    | [?]     | [?]  | [?]  | 1    | 20,000   | 0.0397 | 0.0026 | 0.0354 | 0.0441 | 107  | 87   | 200     | 1       | 0.27

Parameters:
  True payment error rate        0.04
  Standard deviation             0.012
  Beta                           40
  rho                            0.7
  Sample size, n'                120
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-7. Federal withholding, Rule D, Example 7

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0288 | 0.0057  | -1   |      | -1   | 1,000    | 0.0288 | 0.0057 | 0.0194 | 0.0381 |      | -1   | 3       | 4       | 4.65
2    | 0.0306 | 0.00548 | 1    |      | 1    | 2,000    | 0.0297 | 0.0040 | 0.0232 | 0.0362 |      | -1   | 6       | 7       | ***
3    | 0.0314 | 0.00553 | 1    |      | 1    | 3,000    | 0.0303 | 0.0032 | 0.0250 | 0.0356 |      | 1    | 9       | 1       | ***
4    | 0.0269 | 0.00464 | -3   |      | -3   | 4,000    | 0.0294 | 0.0027 | 0.0250 | 0.0338 |      | -2   | 12      | 14      | 4.62
5    | 0.025  | 0.00535 | -5   |      | -5   | 5,000    | 0.0285 | 0.0024 | 0.0246 | 0.0325 |      | -7   | 15      | 22      | 1.63
6    | 0.0357 | 0.00609 | 6    |      | 6    | 6,000    | 0.0297 | 0.0022 | 0.0260 | 0.0334 |      | -2   | 18      | [?]     | 7.96
7    | 0.0421 | 0.00684 | 12   |      | 12   | 7,000    | 0.0315 | 0.0022 | 0.0279 | 0.0350 |      | 10   | 21      | 11      | 1.45
8    | 0.0297 | 0.00469 |      |      |      | 8,000    | 0.0313 | 0.0020 | 0.0280 | 0.0345 |      | 10   | 24      | 14      | 1.56
9    | 0.0288 | 0.00559 | -1   |      | -1   | 9,000    | 0.0310 | 0.0019 | 0.0279 | 0.0341 |      | 9    | 27      | 11      | 1.87
10   | 0.0381 | 0.00621 | 8    |      | 8    | 10,000   | 0.0317 | 0.0018 | 0.0288 | 0.0346 |      | 17   | 30      | 19      | 1.05
11   | 0.0269 | 0.00646 | -3   |      | -3   | 11,000   | 0.0313 | 0.0017 | 0.0284 | 0.0341 |      | 14   | 33      | 11      | 1.36
12   | 0.0375 | 0.00649 | 8    |      | 8    | 12,000   | 0.0318 | 0.0017 | 0.0290 | 0.0345 |      | 21   | 36      | 15      | 0.94
13   | 0.0272 | 0.00513 | -3   |      | -3   | 13,000   | 0.0314 | 0.0016 | 0.0288 | 0.0341 |      | 19   | 39      | 21      | 1.11
14   | 0.0375 | 0.00707 | 8    |      | 8    | 14,000   | 0.0319 | 0.0016 | 0.0293 | 0.0344 |      | 26   | 42      | 11      | 0.84
15   | 0.0348 | 0.00578 | 5    |      | 5    | 15,000   | 0.0321 | 0.0015 | 0.0296 | 0.0345 |      | 31   | 45      | 14      | 0.73
16   | 0.0173 | 0.0047  | -13  |      | -13  | 16,000   | 0.0311 | 0.0014 | 0.0288 | 0.0335 |      | 18   | 48      | [?]     | 1.27
17   | 0.0384 | 0.00702 | 8    |      | 8    | 17,000   | 0.0316 | 0.0014 | 0.0292 | 0.0339 |      | 27   | 51      | 24      | 0.90
18   | 0.0207 | 0.0059  | -9   |      | -9   | 18,000   | 0.0310 | 0.0014 | 0.0287 | 0.0332 |      | 17   | 54      | 37      | 1.43
19   | 0.0312 | 0.00559 | 1    |      | 1    | 19,000   | 0.0310 | 0.0013 | 0.0288 | 0.0332 |      | 19   | 57      | [?]     | 1.37
20   | 0.0422 | 0.0067  | 13   |      | 13   | 20,000   | 0.0316 | [?]    | 0.0294 | 0.0337 |      | 32   | 60      | [?]     | 0.84

Parameters:
  True payment error rate        0.033
  Standard deviation             0.006
  Beta                           40
  rho                            0.7
  Sample size, n'                360
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.


























Table H-8. Federal withholding, Rule D, Example 8

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0249 | 0.01007 | [?]  |      | -5   | 1,000    | 0.0249 | 0.0101 | 0.0083 | 0.0415 |      | -5   | 3       | 1       | 1.96
2    | 0.0464 | 0.01336 | 16   |      | 16   | 2,000    | 0.0357 | 0.0084 | 0.0219 | 0.0494 |      | 11   | 6       | -5      | 1.47
3    | 0.0277 | 0.01395 | -2   |      | -2   | 3,000    | 0.0330 | 0.0073 | 0.0211 | 0.0450 |      | 9    | 9       |         | 2.41
4    | 0.0237 | 0.0124  | -6   |      | -6   | 4,000    | 0.0307 | 0.0063 | 0.0204 | 0.0410 |      | 3    | 12      | 9       | 9.23
5    | 0.0241 | 0.01433 | -6   |      | -6   | 5,000    | 0.0294 | 0.0058 | 0.0199 | 0.0389 |      | -3   | 15      | 11      | 9.16
6    | 0.030  | 0.0067  |      |      |      | 6,000    | 0.0295 | 0.0049 | 0.0214 | 0.0376 |      | -3   | 18      | 21      | 9.69
7    | 0.0245 | 0.00936 | -6   |      | -6   | 7,000    | 0.0288 | 0.0044 | 0.0215 | 0.0361 |      | -9   | 21      | 30      | 3.63
8    | 0.0258 | 0.00855 | -6   |      | -6   | 8,000    | 0.0282 | 0.0040 | 0.0215 | 0.0348 |      | -15  | 24      | [?]     | 2.19
9    | 0.0536 | 0.01358 | 24   |      | 24   | 9,000    | 0.0310 | 0.0039 | 0.0246 | 0.0374 |      | 9    | 27      | [?]     | 3.95
10   | 0.0338 | 0.01014 |      |      |      | 10,000   | 0.0313 | 0.0036 | 0.0253 | 0.0373 |      | 13   | 30      | 17      | 2.87
11   | 0.0314 | 0.01429 | 1    |      | 1    | 11,000   | 0.0313 | 0.0036 | 0.0254 | 0.0371 |      | 14   | 33      | 19      | 2.78
12   | 0.0293 | 0.01296 | -1   |      | -1   | 12,000   | 0.0311 | 0.0034 | 0.0255 | 0.0368 |      | 13   | 36      | 23      | 3.08
13   | 0.029  | 0.01213 | -1   |      | -1   | 13,000   | 0.0310 | 0.0033 | 0.0255 | 0.0364 |      | 12   | 39      | 27      | 3.45
14   | 0.0663 | 0.01572 | 36   |      | 36   | 14,000   | 0.0335 | 0.0033 | 0.0281 | 0.0389 |      | 49   | 42      | -7      | 0.94
15   | 0.0301 | 0.01491 |      |      |      | 15,000   | 0.0333 | 0.0032 | 0.0280 | 0.0385 |      | 49   | 45      | -4      | 0.99
16   | 0.0267 | 0.00848 | -3   |      | -3   | 16,000   | 0.0328 | 0.0031 | 0.0278 | 0.0379 |      | 46   | 48      | 2       | 1.07
17   | 0.0326 | 0.01182 | 3    |      | 3    | 17,000   | 0.0328 | 0.0030 | 0.0280 | 0.0377 |      | 48   | 51      | 3       | 1.05
18   | 0.0560 | 0.01634 | 27   |      | 27   | 18,000   | 0.0342 | 0.0029 | 0.0293 | 0.0390 |      | 75   | 54      | -21     | 0.71
19   | 0.0371 | 0.01461 | 7    |      | 7    | 19,000   | 0.0343 | 0.0029 | 0.0296 | 0.0391 |      | 82   | 57      | -25     | 0.67
20   | 0.0156 | 0.00795 | -14  |      | -14  | 20,000   | 0.0334 | 0.0028 | 0.0288 | 0.0379 |      | 66   | 60      | [?]     | 0.62

Parameters:
  True payment error rate        0.033
  Standard deviation             0.012
  Beta                           40
  rho                            0.7
  Sample size, n'                120
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-9. Federal withholding, Rule D, Example 9

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0341 | 0.00672 | 4    |      | 4    | 1,000    | 0.0341 | 0.0067 | 0.0230 | 0.0451 |      | 4    |         | -4      | 1.65
2    | 0.025  | 0.00597 | -5   |      | -5   | 2,000    | 0.0296 | 0.0045 | 0.0222 | 0.0369 |      | -1   |         | 1       | ***
3    | 0.0349 | 0.00662 | 5    |      | 5    | 3,000    | 0.0313 | 0.0037 | 0.0252 | 0.0375 |      | 4    |         |         | 2.77
4    | 0.0303 | 0.00736 |      |      |      | 4,000    | 0.0311 | 0.0033 | 0.0256 | 0.0366 |      | 4    |         | -1      | 3.07
5    | 0.0275 | 0.0075  | -3   |      | -3   | 5,000    | 0.0304 | 0.0031 | 0.0253 | 0.0354 |      | 2    |         | -4      | 6.31
6    | 0.0327 | 0.00574 | 3    |      | 3    | 6,000    | 0.0308 | 0.0027 | 0.0263 | 0.0352 |      | 5    |         |         | 3.62
7    | 0.0229 | 0.00611 | -7   |      | -7   | 7,000    | 0.0296 | 0.0025 | 0.0255 | 0.0337 |      | -3   |         |         | 6.90
8    | 0.0411 | 0.00669 | 11   |      | 11   | 8,000    | 0.0311 | 0.0023 | 0.0272 | 0.0349 |      | 9    |         |         | 2.18
9    | 0.0296 | 0.00666 |      |      |      | 9,000    | 0.0309 | 0.0022 | 0.0273 | 0.0345 |      | 8    |         |         | 2.44
10   | 0.0267 | 0.005   | -3   |      | -3   | 10,000   | 0.0305 | 0.0020 | 0.0271 | 0.0339 |      | 5    |         |         | 4.23
11   | 0.0266 | 0.00561 | -3   |      | -3   | 11,000   | 0.0301 | 0.0019 | 0.0270 | 0.0333 |      | 1    |         |         | 
12   | 0.0193 | 0.00442 | -11  |      | -11  | 12,000   | 0.0292 | 0.0018 | 0.0263 | 0.0322 |      | -9   |         |         | 2.34
13   | 0.0318 | 0.00652 | 2    |      | 2    | 13,000   | 0.0294 | 0.0017 | 0.0266 | 0.0323 |      | -7   |         |         | 3.05
14   | 0.0328 | 0.00647 | 3    |      | 3    | 14,000   | 0.0297 | 0.0017 | 0.0269 | 0.0324 |      | -5   |         |         | 5.14
15   | 0.031  | 0.00495 | 1    |      | 1    | 15,000   | 0.0298 | 0.0016 | 0.0271 | 0.0324 |      | -4   |         |         | 6.69
16   | 0.0341 | 0.00732 | 4    |      | 4    | 16,000   | 0.0300 | 0.0016 | 0.0274 | 0.0326 |      | 1    |         | [?]     | ***
17   | 0.027  | 0.00615 | -3   |      | -3   | 17,000   | 0.0299 | 0.0015 | 0.0273 | 0.0324 |      | -2   |         | 2       | ***
18   | 0.0201 | 0.00516 | -10  |      | -10  | 18,000   | 0.0293 | 0.0015 | 0.0269 | 0.0317 |      | -12  |         | 12      | 2.13
19   | 0.0397 | 0.00699 | 10   |      | 10   | 19,000   | 0.0299 | 0.0014 | 0.0275 | 0.0322 |      | -3   |         | [?]     | ***
20   | 0.0319 | [?]     | 2    |      | 2    | 20,000   | [?]    | 0.0014 | 0.0276 | 0.0323 |      | -1   |         | 1       | ***

Parameters:
  True payment error rate        0.03
  Standard deviation             0.006
  Beta                           40
  rho                            0.7
  Sample size, n'                360
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-10. Federal withholding, Rule D, Example 10

Year | Annual | Annual  | AFDC | Ann. | Ann. | Cum.     | Cum.   | Cum.   | Lower  | Upper  | Cum. | Cum. | Desired | Disall. | cv
     | error  | sigma   |      | cash | book | Federal  | error  | sigma  | bound  | bound  | cash | book | disall. | error   |
     | rate   |         |      |      |      | contrib. | rate   |        |        |        |      |      |         |         |
1    | 0.0297 | 0.01107 |      |      |      | 1,000    | 0.0297 | 0.0111 | 0.0115 | 0.0479 |      |      |         |         | ***
2    | 0.0153 | 0.01068 | -15  |      | -15  | 2,000    | 0.0225 | 0.0077 | 0.0090 | 0.0351 |      | -15  |         | 15      | 1.02
3    | 0.0443 | 0.01591 | 14   |      | 14   | 3,000    | 0.0298 | 0.0074 | 0.0176 | 0.0419 |      | -1   |         | 1       | ***
4    | 0.0567 | 0.01551 | 27   |      | 27   | 4,000    | 0.0365 | 0.0068 | 0.0254 | 0.0476 |      | 26   |         | -21     | 1.04
5    | 0.0279 | [?]     | -2   |      | -2   | 5,000    | 0.0348 | 0.0058 | 0.0253 | 0.0443 |      | 24   |         | -24     | 1.21
6    | 0.0348 | [?]     | 5    |      | 5    | 6,000    | 0.0348 | 0.0054 | 0.0259 | 0.0437 |      | 29   |         | -21     | 1.13
7    | 0.009  | 0.00564 | -21  |      | -21  | 7,000    | 0.0311 | 0.0047 | 0.0234 | 0.0388 |      | 8    |         | -1      | 4.24
8    | 0.0653 | 0.01938 | 35   |      | 35   | 8,000    | 0.0354 | 0.0048 | 0.0275 | 0.0432 |      | 43   |         | -43     | 0.89
9    | 0.0411 | 0.01314 | 11   |      | 11   | 9,000    | 0.0360 | 0.0045 | 0.0286 | 0.0434 |      | 54   |         | -54     | 0.74
10   | 0.0170 | 0.01002 | -12  |      | -12  | 10,000   | 0.0342 | 0.0042 | 0.0274 | 0.0410 |      | 42   |         | -42     | 0.99
11   | 0.0187 | 0.012   | -11  |      | -11  | 11,000   | 0.0328 | 0.0039 | 0.0263 | 0.0393 |      | 31   |         | -31     | 1.41
12   | 0.042  | 0.01023 | 12   |      | 12   | 12,000   | 0.0336 | 0.0037 | 0.0275 | 0.0396 |      | 43   |         | -43     | 1.04
13   | 0.0428 | 0.01491 | 13   |      | 13   | 13,000   | 0.0343 | 0.0036 | 0.0283 | 0.0402 |      | 55   |         | -55     | [?]
14   | 0.0195 | 0.00788 | -10  |      | -10  | 14,000   | 0.0332 | 0.0034 | 0.0276 | 0.0388 |      | 45   |         | -45     | 1.06
15   | 0.0233 | 0.0079  | -7   |      | -7   | 15,000   | 0.0325 | 0.0032 | 0.0273 | 0.0376 |      | 36   |         | -31     | 1.26
16   | 0.0214 | 0.00805 | -9   |      | -9   | 16,000   | 0.0319 | 0.0031 | 0.0268 | 0.0369 |      | 30   |         | [?]     | 1.65
17   | 0.0326 | 0.01118 | 3    |      | 3    | 17,000   | 0.0319 | 0.0029 | 0.0270 | 0.0367 |      | 32   |         | -32     | 1.55
18   | 0.0396 | 0.0133  | 10   |      | 10   | 18,000   | 0.0323 | 0.0029 | 0.0276 | 0.0371 |      | 42   |         | -42     | 1.24
19   | 0.0128 | 0.00783 | -17  |      | -17  | 19,000   | 0.0313 | 0.0028 | 0.0268 | 0.0356 |      | 25   |         | -25     | 2.12
20   | [?]    | 0.01357 | [?]  | [?]  | [?]  | 20,000   | 0.0312 | [?]    | [?]    | 0.0356 |      | 24   |         | [?]     | 2.29

Parameters:
  True payment error rate        0.03
  Standard deviation             0.012
  Beta                           40
  rho                            0.7
  Sample size, n'                120
  Annual Federal contribution    1,000

Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-11. Federal withholding. Rule D, Example 11 



n 



Errtr 

rtte 



ttfM 



ATDC 



CwrrtM 

MMllMtRn 



CMk I BMk 



CMtlib 



Err»r 
rtte 



9if iM I LM«r I U^r 



ClWMrittti VtlMW 



hiuni I iamt I Cask 



Oi» rtkv»Bce| 
I BMk I 



Desirtd 
MmII 



MmII 
Error 



cv 





1 


0.022 


000672 


-6 


-8 


1,000 


0.0220 


0.0067 


0.0110 


0.0331 





-8 


-5 


3 


084 


2 


0.0208 


0.00642 


-9 -2 


-7 


2,000 


0.0214 


0.0046 


00138 


0.0291 


-2 


-15 


-10 


7 


054 


3 


00267 


0.00656 


-3 


-3 


3.000 


0.0232 


0.0038 


0.0169 


0.0294 


-2 


-19 


-15 


S 


0.56 


4 


00173 


0.0068 


-13 -10 


-3 


4,000 


0.0217 


0.0033 


00163 


0.0272 


-11 


-22 


-20 


IS 


040 


5 


0.0327 


0.00773 




-3 


S.000 


0.0239 


0.0031 


0.0189 


0.0290 


-5 


-25 


-25 


s 


050 


6 


00188 


000504 


-11 -10 


-1 


6,000 


00231 


0.0027 


0.0186 


0.027S 


-15 


-27 


-30 


ti 


039 


7 


0.0226 


0.00652 


-7 -5 


-2 


7,000 


0.0230 


0.0025 


0.0189 


0.0271 


-20 


-29 


-35 


14 


036 


8 


0.0234 


0.0068 


-7 -4 


-2 


8,000 


0.0231 


0.0023 


0.0192 


0.0269 


-25 


-31 


-40 


18 


0.34 


9 


00128 


0.00394 


-17 -16 


-1 


9,000 


0.0219 


0.0021 


0.0184 


0.0254 


-41 


-31 


-45 


21 


026 


10 


0.0256 


0.00678 


-4 -2 


-2 


10.000 


0.0223 


0.0020 


0.0189 


0.0256 


-44 


-33 


-50 


27 


026 


II 


0.0245 


0.00584 


-6 -4 


.-1 


11,000 


0.0225 


0.0019 


0.0193 


0.0256 


-48 


-35 


-55 


u 


0.26 


12 


0.0263 


0.00576 


-4 -2 


-1 


12.000 


0.0228 


0.0018 


0.0198 


0.0258 


-50 


-56 


-60 


21 


0.25 


13 


0.0372 


0.00725 




-2 


13,000 


0.0239 


00018 


0.0210 


0.0268 


-41 


-38 


-65 


14 


0.29 


14 


0.0249 


000588 


-5 -4 


-1 


14,000 


0.0240 


0.0017 


0.0212 


0.0268 


-45 


-39 


-70 


14 


0.28 


15 


0.0236 


0.00425 


-6 -6 


-1 


15.000 


0.0240 


0.0016 


0.0213 


0.0266 


-51 


-40 


-75 


1 


027 


16 


0.0281 


0.006SS 


-2 -1 


-1 


16,000 


0.0242 


0.0016 


0.0216 


0.0268 


-51 


-41 


-80 


\ 


027 


17 


0.0191 


0.00486 


-11 -10 


-1 


17,000 


0.0239 


0.001S 


0.0215 


0.0264 


-62 


-42 


-85 


11 


0.25 


18 


00235 


0.0067 


-7 -5 


-1 


18,000 


0.0239 


0.0015 


0.0215 


0.0263 


-67 


-43 


-90 


20 


024 


19 


0.0207 


0.0052 


-9 -8 


-1 


19,000 


0.0237 


0.0014 


00214 


0.0260 


-75 


-44 


-95 


24 


022 


20 


0.0?I5 


0005?$ 


-8 -7 


-1 


20.00Q 


0.0?36 


0.P0I4 


0.0214 


0.0259 


-83 


-45 


-100 


?i 


0.21 



Parameters:

True payment error rate 0.025

Standard deviation 0.006

Beta 40

rho 0.7

Sample size, n' 360

Annual Federal contribution 1,000



Note: *** indicates that the coefficient of variation is 10 or greater. 






Table H-12. Federal withholding, Rule D, Example 12




















Columns: Year; Error rate; Sigma; AFDC current disallowance (Cash, Book); Cumulative values: Federal contrib.; Error rate; Sigma; Lower bound; Upper bound; Disallowance (Cash, Book); Desired disall.; Disall. error; cv




1 


0.03 


0.01569 











1,000 


0.0300 


0.0159 


0.0039 


0.0562 








-5 


-5 


••• 




0.0321 


0.01311 


2 





2 


2,000 


0.031 1 


0.0103 


0.0142 


0.0460 





2 


-10 


-12 


941 




0.0163 


0.00753 


-14 





-14 


3.000 


0.0262 


0.0073 


0.0141 


0.0362 





-12 


-15 


-8 


191 




0.0343 


0.01 236 


4 





4 


4,000 


0.0282 


0.0063 


0.0178 


0.0366 





-7 


-20 


-IS 


350 




0.0186 


0.00956 


-11 





-11 


5.000 


0.0263 


0.0054 


0.0174 


0.0351 





-19 


-25 


•< 
4 " 


145 




0.0145 


0.01014 


-15 





-15 


6,000 


0.0243 


0.0046 


0.0164 


0.0322 





-34 


-30 


085 




0.0423 


0.01276 


12 





12 


7,000 


0.0269 


0.0045 


0.0195 


0.0343 





-22 


-35 


-13 


145 




0.0167 


000724 


-13 





-13 


8,000 


0.0256 


0.0040 


0.0190 


0.0323 





-55 


-40 


-5 


092 




0.0282 


000842 


-2 





-2 


9,000 


0.0259 


0.0037 


0.0198 


0.0320 





-37 


-45 


•« 


91 


10 


0.0406 


0.01401 


11 





11 


10.000 


0.0274 


0.0037 


0.0214 


0.0334 





-26 


-50 


-24 


140 


n 


0.0316 


0.01293 


2 





2 


11,000 


0.0278 


00035 


0.0220 


0.0336 





-24 


-55 


-St ~ 


1.59 


12 


0.0222 


0.00971 


-8 





-8 


12,000 


0.0273 


0.0033 


0.0216 


0.0328 





-32 


-60 


•21 


124 


13 


0165 


0.01127 


-12 





-12 


13,000 


0.0266 


0.0032 


0.0214 


0.0319 





-44 


-65 


-21 


095 


14 


0.0326 


0.01342 


3 





3 


14,000 


0.0271 


0.0031 


0.0219 


00322 





-41 


-70 


-21 


106 


15 


0.0207 
0.04 


0.01003 
0.01 304 


-9 





-9 


15.000 


0.0266 


0.0030 


0.0217 


0.0315 





-50 


-75 


■25 


089 


16 


10 





10 


16,000 


0.0275 


0.0029 


0.0227 


0.0323 





-40 


-80 


41 


1 15 


17 


0.0496 


0.01913 


20 





20 


17,000 


0.0288 


0.0030 


0.0239 


0.0337 





-21 


-85 


-M 


242 


16 


0.0007 


0.0073 


-21 





-21 


18,000 


0.0277 


0.0026 


0.0230 


0.0323 





-42 


-90 


41 


121 


19 


0.0474 


0.01625 


17 





17 


19,000 


00287 


0.0028 


0.0241 


0.0333 





-25 


-95 


-Tl 


217 


20 


0.0306 


0.01507 


1 





1 


20.000 


po??? 


p.0928 


0.0242 


0.0334 





-24 


-100 


-71 


2.31 


Parameters:

True payment error rate 0.025

Standard deviation 0.012

Beta 40

rho 0.7

Sample size, n' 120

Annual Federal contribution 1,000



Note: *** indicates that the coefficient of variation is 10 or greater.


























Table H-13. Federal withholding, Rule D, Example 13




I 
2 
3 
4 
5 




0.075 
0.0612 
00639 
0.0488 
0421 



>1|N» 



0.00694 
0.00520 
0.00857 
0.00451 
0.003S7 



Columns: AFDC current disallowance (Cash, Book); Cumulative values: Federal contrib.; Error rate; Sigma; Lower bound; Upper bound; Disallowance (Cash, Book); Desired disall.; True error rate; Disall. error; cv



45 
31 
S4 
19 
12 



35 
28 
28 
17 
II 



10 
3 

6 
I 
I 



1,000 
2,000 
3,000 
4,000 
5.000 



0.0750 
0.0681 
0.0667 
0.0622 
0.0582 



00063 
0.0041 
00040 
00032 
0.0026 



0.0646 
0.0615 
0.0602 
0.0570 
0.0539 



0.O8S5 
0.0749 
0.0732 
0.0675 
0.0625 



35 
63 
90 

toe 

119 



to 

14 
20 
21 
22 



40 

75 

105 

135 

155 



0070 
0065 
0060 
0.060 
0050 



-5 
-I 
-5 
6 
14 



1409 
106} 

oioei 

00987 
00934 



6 00438 

7 0.0428 

8 0.053S 

9 0.0492 
10 0037 



0.00443 
0.00458 
0.00779 
0.00614 
00067 



14 
13 
23 
19 
7 



13 
12 
20 
17 
5 



I 
t 
3 
2 
2 



6,000 
7,000 
8.000 
9,000 
10.000 



00S58 
0.0539 
0.0S39 
00533 
0.0517 



00023 
0.0021 
0.0021 
0.0020 
0.0019 



0.0520 
0.0505 
0.0505 
0.0501 
0.0486 



00596 
0.0574 
0.0573 
0.0566 
0.0548 



132 
144 
164 
181 
186 



23 
24 
27 
29 
31 



175 
195 
215 
230 
245 



0050 
0.050 
0050 
0.045 
0.045 



20 
27 
24 
20 
28 



00897 
00673 
00868 
0.0841 
0.067 



11 0.0365 

12 00327 

13 0493 

14 0.041 S 

15 0.0381 



16 
17 
18 
19 
20 



000362 
0.00457 
000722 
0.00678 
000647 



0.0296 
00377 
0.0265 
00237 



0.00584 
0.00621 
0.00571 
0.00574 



6 

3 

19 

12 

8 



6 

8 

-4 

-6 
-7 



6 

2 

17 

10 

7 



1 
1 
2 
2 
2 



11,000 
12,000 
13.000 
14,000 
15.000 



0.0505 
0.0489 
0.0489 
0.0484 
0.0477 



0.0017 
0.0016 
0.0016 
0.0016 
0.0015 



0.0475 
0.0461 
0.0462 
00458 
0.04S2 



0.0532 
0.0516 
0.0516 
0.0510 
0.0502 



192 
194 
211 
221 
227 



32 
33 
35 
36 
38 



255 
265 
275 
280 
285 



-1 
6 
-5 
-7 
•8 



16,000 
17,000 
18,000 
19,000 

Msas. 



0.0466 
00460 
0.0450 
0.0438 



6.0015 
0.0014 
0.0014 
0.0014 
00013 



0.0441 
00437 
0.0426 
0.0416 
0.0406 



0.0490 
00484 
0.0473 
0.0461 
0.0450 



226 
232 
228 
220 
212 



39 
40 
41 
43 



285 
285 
285 
280 
275 



0.040 
0.040 
0040 
0035 
0.035 



0030 
0.030 
0030 
0025 
0.025 



31 
39 
29 
23 
20 



0.0861 
0.0874 
00857 
0086 
00869 



20 0.0897 

12 00901 

16 0.0937 

17 0.0984 
19 01037 



Parameters: 

Varying payment error rate 

Standard deviation 0.006 

Beta 40 

rho 0.7 

Sample size, n' 360 

Annual Federal contribution 1,000 



Note: *** indicates that the coefficient of variation is 10 or greater.






Table H-14. Federal withholding, Rule D, Example 14






















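Columns: Year; Error rate; Sigma; AFDC current disallowance (Cash, Book); Cumulative values: Federal contrib.; Error rate; Sigma; Lower bound; Upper bound; Disallowance (Cash, Book); Desired disall.; True error rate; Disall. error; cv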




i 0.0921 


0.02026 


62 


29 


55 


1,000 


0.0921 


0.0203 


0.0587 


0.1254 


29 


55 


40 


0070 


-22 


035 


2 0.0564 


0.0124 


26 


21 


6 


2,000 


0.0742 


0.0119 


0.0547 


0.0958 


49 


59 


75 


0.065 


-13 


027 


3 0.069 


0.01207 


39 


34 


5 


3,000 


0.0725 


0.0069 


0.0579 


0.0871 


84 


44 


105 


0060 


-22 


21 


4 0.0485 


0.00685 


18 


17 


1 


4.000 


0.0664 


0.0069 


0.0551 


0.0777 


too 


45 


155 


0060 


-n 


019 


5 00476 


0.01161 


18 


14 


4 


S.000 


0.0627 


0.0060 


0.0529 


0.0726 


114 


49 


155 


O.OSO 


-9 


Die 


6 0.0294 


001481 


-1 


-6 


"6 


6,000 


0.0572 


0.0056 


00480 


0.0665 


108 


55 


175 


0050 


12 


021 


7 0.0656 


0.01483 


54 


28 


5 


7.000 


0.0561 


0.0052 


0.0495 


0.0667 


136 


60 


195 


0050 


-2 


019 


8 0.0452 


0.01592 


15 


11 


4 


8.000 


0.0565 


0.0049 


0.0484 


0.0645 


147 


64 


21s 


0.050 


3 


GIB 


9 0.0545 


0.01669 


24 


19 


6 


9,000 


0.0562 


0.0047 


0.0485 


0.0640 


166 


70 


250 


0.045 


-6 


016 


10 0.0415 


0.00611 
0.00709 


11 
-12 


10 
-13 


1 
1 


10.000 


0.0548 


0.0043 


0.0476 


0.0619 


176 


71 


245 


0045 


-3 


017 


11 0.0181 


11,000 


0.0514 


0.0040 


0.0449 


00580 


163 


72 


255 


0040 


19 


019 


12 0322 


0.01004 


2 





2 


12,000 


0.0498 


0.0057 


0.0457 


0.0560 


164 


74 


265 


0040 


27 


019 


13 0348 


0.00673 


5 


4 


1 


13,000 


0.0487 


00055 


0.0429 


0.0544 


168 


75 


275 


0040 


32 


019 


14 0368 


0.0146 


7 


3 


4 


14,000 


0.0476 


0.0054 


0.0422 


0.0554 


171 


79 


280 


0035 


31 


019 


15 0.0406 


0.0107 


11 


9 


2 


15.000 


0.0473 


0.0035 


0.0420 


0.0527 


179 


81 


285 


0.035 


25 


0.19 


16 0.0197 


0.01019 


-18 


-12 


2 


16,000 


0.0456 


0.0051 


0.0405 


0.0506 


167 


82 


285 


0.030 


35 


020 


17 0.0357 


0.01192 


6 


5 


2 


17,000 


0.0450 


0.0050 


0.0401 


0.0500 


171 


85 


285 


0.030 


30 


020 


18 0.0271 


0.00865 


-5 


-4 


1 


18.800 


0.0440 


0.0029 


0.0595 


0.0468 


167 


86 


285 


0.030 


32 


021 


19 0.0094 


0.00441 


-21 


-21 





19,000 


0.0422 


0.0028 


0.0577 


0.0467 


146 


86 


280 


0.025 


46 


0.23 


20 0.0196 


0.00892 


-10 


-12 


_L 


20.000 


0.0411 


0.09?7 


0.0567 


0.9454 


154 


87 


275 


0.025 


53 


024 


Parameters:

Varying payment error rate

Standard deviation 0.012

Beta 40

rho 0.7

Sample size, n' 120

Annual Federal contribution 1,000



Note: *** indicates that the coefficient of variation is 10 or greater.






























Table H-15. Federal withholding, Rule D, Example 15






















Columns: Year; Error rate; Sigma; AFDC current disallowance (Cash, Book); Cumulative values: Federal contrib.; Error rate; Sigma; Lower bound; Upper bound; Disallowance (Cash, Book); Desired disall.; True error rate; Disall. error; cv




t 00S5 


0.00634 


25 


15 


10 


1,000 


00550 


0.0063 


0.0446 


00655 


15 


10 


40 


0070 


15 


025 


2 0.0412 


00OS28 


11 


8 


3 


2,000 


0.0481 


0.0041 


0.0413 


0.0549 


23 


14 


75 


0.065 


39 


023 


I 0.0489 


0.00857 


19 


IS 


6 


3,000 


0.0484 


0.0040 


0.0418 


0.0549 


35 


20 


105 


0.060 


50 


022 


4 0.0288 


0.00451 


-1 


-3 




4,000 


0.0435 


0.0032 


0.0382 


0.0487 


33 


21 


135 


0060 


81 


024 


5 0.0321 


0.00337 


2 


1 




5.000 


0.0412 


0.0026 


0.0369 


0.0455 


34 


22 


155 


0.050 


99 


0.24 


6 0.0288 


0.00443 


-1 


-2 




6,000 


0.0391 


0.0023 


0.0353 


0.0429 


32 


23 


175 


0.050 


120 


025 


7 00278 


0.00458 


-2 


-3 




7,000 


0.0375 


0.0021 


0.0341 


0.0410 


29 


24 


195 


0.050 


142 


026 


8 0.0383 


0.00779 


8 


5 




8,000 


0.0376 


0.0021 


0.0342 


0.0410 


34 


27 


215 


0.050 


154 


027 


9 00342 


0.00614 


4 


2 




9,000 


0.0372 


0.0020 


0.0340 


0.0405 


36 


29 


230 


0.045 


165 


027 


10 0.022 


0.0067 


8 


-10 




10.000 


0.0357 


0.0019 


0.0326 


0.0388 


26 


31 


245 


0.045 


188 


033 


11 0.0265 


0.00362 


-4 


-4 




11,000 


0.0349 


0.0017 


0.0320 


0.03^ 


22 


32 


255 


0.040 


201 


036 


12 0.0177 


000457 


-12 


-13 




12,000 


0.0334 


0.0016 


0.0307 


0.0562 


9 


35 


265 


0.040 


224 


048 


13 0.0343 


000722 


4 


2 




13,000 


0.0335 


0.0016 


0.0308 


0.0362 


11 


35 


275 


0040 


229 


046 


14 0315 


0.00678 


2 







14,000 


0.0334 


0.0016 


0.0308 


0.0360 


(1 


36 


280 


0035 


235 


047 


15 0.0281 


0.00647 


-2 


-3 




15.000 


0.0330 


0.0015 


0.0^ 


0.0355 


7 


36 


285 


0.035 


240 


051 


16 0.0298 


000S84 





-1 




16,000 


0.0328 


0.0015 


0.0304 


0.0353 


6 


39 


285 


0.030 


240 


053 


17 00377 


0.00621 


8 


6 




17,000 


0.0331 


0.0014 


0.0307 


0.0355 


12 


40 


285 


0030 


232 


047 


18 0.0265 


0.00571 


-4 


-5 




18,000 


0.0327 


0.0014 


0.0304 


0.0350 


6 


41 


285 


0.030 


236 


51 


19 0.0237 


O.0OS74 


-6 


-7 




19,000 


0.0323 


0.0014 


0.0300 


0.0345 





43 


280 


0025 


237 


060 


20 0.0228 


0.00S8S 


-7 





ll 


20.000 


0.0318 


0.0013 


0.0296 


0.0340 





36 


275 


0025 


239 


074 


Parameters:

Varying payment error rate

Standard deviation 0.006

Beta 40

rho 0.7

Sample size, n' 360

Annual Federal contribution 1,000



Note: *** indicates that the coefficient of variation is 10 or greater.






























Table H-16. Federal withholding, Rule D, Example 16






















Columns: Year; Error rate; Sigma; AFDC current disallowance (Cash, Book); Cumulative values: Federal contrib.; Error rate; Sigma; Lower bound; Upper bound; Disallowance (Cash, Book); Desired disall.; True error rate; Disall. error; cv




1 0.0721 


0.02026 


42 


9 33 


1,000 


0.0721 


00203 


0.0387 


01054 


9 


33 


40 


0070 


-2 


048 


2 0.0364 


0.0124 


6 


1 6 


2,000 


0.0542 


0.0119 


0.0347 


0.0738 


9 


39 


75 


0.065 


27 


049 


3 0.054 


0.01207 


24 


19 5 


3,000 


O0541 


0.0089 


0.0395 


00687 


29 


44 


105 


0060 


33 


037 


4 0.0283 


000685 


-2 


-3 1 


4.000 


00477 


0.0069 


0.0364 


0.0590 


25 


45 


135 


0060 


64 


039 


5 0.0376 


0.01181 


8 




5.000 


0.0457 


0.0060 


0.0359 


0.0556 


29 


49 


155 


0050 


76 


038 


6 0144 


001481 


-16 


-21 6 


6,000 


O0405 


0.0056 


0.0313 


0.0496 


8 


55 


175 


0050 


112 


053 


7 0.0486 


0.01483 


19 


13 5 


7,000 


0.0416 


0.0052 


0.0331 


0.0502 


21 


60 


195 


0050 


IIS 


045 


8 0.0302 


0.01392 





-4 4 


8,000 


0.0402 


0.0049 


0.0322 


0.0483 


17 


64 


215 


0050 


133 


048 


9 0.0393 


01669 


9 




9,000 


0.0401 


0.0047 


00323 


0.0479 


21 


70 


230 


0045 


139 


047 


to 0265 


0.00011 


-4 


-5 1 


10.000 


0.0388 


0.0043 


0.0316 


0.0459 


16 


71 


245 


0045 


157 


049 


tl 0.0081 


0.00709 


-22 


-16 -6 


11,000 


0.0360 


0.0040 


0.0294 


0.0425 





66 


255 


0040 


189 


067 


12 0.0172 


0.01004 


-15 


-13 


12,000 


0.0344 


00037 


0.0282 


0.0406 





53 


265 


0.040 


212 


085 


13 0198 


000673 


-10 


-10 


13,000 


O0333 


0.0035 


0.0275 


0.0390 





43 


275 


0.040 


232 


107 


14 0268 


0.0146 


-3 


-3 


14,000 


00328 


0.0034 


00272 


0.0384 





39 


280 


0035 


241 


1 21 


15 0.0306 


0.0107 


1 


1 


15.000 


0.0327 


0.0033 


0.0273 


0.0380 





40 


285 


0.035 


245 


1 22 


16 0.0197 


0.01019 


-10 


-10 


16,000 


0.0319 


00031 


00267 


0.0370 





30 


285 


0030 


255 


1 68 


17 0.0357 


0.01192 


6 


6 


17,000 


0.0321 


0.0030 


0.0271 


0.0371 





35 


285 


0030 


250 


145 


18 0.0271 


0.00865 


-3 


-3 


18,000 


0.0318 


0.0029 


0.0270 


0.0366 





33 


285 


0030 


252 


1 60 


19 0.0094 


000441 


-21 


-21 


19,000 


0.0306 


0.0028 


0.0261 


0.0352 





12 


280 


0025 


268 


4 36 


20 0.0196 


000892 


-10 


-10 


20.000 


0.0301 


0.0027 


00257 


0.0344 





2 


275 


0025 


273 *•• 


Parameters:

Varying payment error rate

Standard deviation 0.012

Beta 40

rho 0.7

Sample size, n' 120

Annual Federal contribution 1,000



Note: *** indicates that the coefficient of variation is 10 or greater.




























APPENDIX I 

EFFECT OF SUBSTITUTING t FOR T IN ESTIMATING
OVERPAYMENT ERROR RATES 



The estimator of the overpayment error rate in current use, R, given by
Equation (1) in Chapter 1 of the report and by Appendix B, involves the quantity t,
the average AFDC payment per case as estimated from the state sample. The
original proposal for the regression estimator instead used T, the average AFDC
payment per case in the complete caseload of the state in the specified time
period, of which t is the estimate from the state sample. This raises questions
about the statistical efficiency of the estimator R, based on t, and the validity of
the estimator of its variance. This appendix examines these questions.

The evaluation was done by simulating the sampling and estimating
procedures for Population A. For each of three sample sizes, 1000 samples were
drawn.^ In each of the samples, the regression estimator and three difference
estimators (using three values of the coefficient k; see Appendix B) were computed,
using t and also using T. The variation of the estimates over the 1000 samples
provided estimates of the variances of the alternative estimators, denoted
$\sigma^2_{x''/t}$ and $\sigma^2_{x''/T}$. The results are shown in Table I-1.
For both the regression and the difference
estimators, the variances of the estimates of R do not differ greatly. For the
regression estimator the relative difference is only 8 to 10 percent, which
corresponds to a relative difference in the standard errors of only about 4 or
5 percent.
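The simulation design described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the caseload below is a hypothetical synthetic population (not Population A), the estimator is a simple ratio of sample means rather than the regression or difference estimators of Appendix B, and the names `simulate`, `population`, `var_t` and `var_T` are introduced here for illustration.

```python
import random
import statistics

def simulate(population, n, replications=1000, seed=0):
    """Compare the variance of x_bar/t_bar with that of x_bar/T.

    population: list of (payment, overpayment) pairs, one per case.
    Returns (variance using the sample mean t, variance using the caseload mean T).
    """
    rng = random.Random(seed)
    T = statistics.fmean(t for t, _ in population)  # caseload average payment
    using_t, using_T = [], []
    for _ in range(replications):
        # Simple random sample of n cases, without replacement.
        sample = rng.sample(population, n)
        t_bar = statistics.fmean(t for t, _ in sample)
        x_bar = statistics.fmean(x for _, x in sample)
        using_t.append(x_bar / t_bar)  # denominator estimated from the sample
        using_T.append(x_bar / T)      # denominator known for the caseload
    return statistics.variance(using_t), statistics.variance(using_T)

# Hypothetical caseload (an assumption for illustration): payments between
# 100 and 500, about 10 percent of cases in error, overpayment proportional
# to the payment.
rng = random.Random(1)
population = []
for _ in range(5000):
    payment = rng.uniform(100, 500)
    overpayment = 0.5 * payment if rng.random() < 0.10 else 0.0
    population.append((payment, overpayment))

var_t, var_T = simulate(population, n=180)
print(var_t, var_T, var_T / var_t)
```

The two estimates are computed from the same 1000 samples, so the printed ratio compares the alternatives on equal terms, as in the design above.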



^The sample sizes used for these simulations were different from, and generally smaller than, those used in
later simulations. These and certain other simulations were done early, with
sample sizes more representative of six-month samples, chosen to illustrate what happens with
samples smaller than the annual samples currently in use.






Moreover, the variances of the estimates that use t are moderately
smaller than those of the estimates that use T. This is because the coefficient of
variation of t is small and the estimated average overpayment per case, x", is
positively correlated with the average AFDC payment per case. The relative
variance (relvariance) of the ratio of two random variables u and v is given,
to a first-order approximation, by



$$V^2_{u/v} = V_u^2 + V_v^2 - 2\rho V_u V_v$$



Here $V_u$ and $V_v$ are the coefficients of variation of u and v, and $\rho$
denotes the correlation between u and v. If the denominator v
is a constant (which is the case when T is used), then the relvariance of the ratio
reduces to $V_u^2$, since $V_v = 0$. If the denominator v is not a constant but a variable
(which is the case when t is used), the relvariance of the ratio differs from $V_u^2$
by the quantity $V_v^2 - 2\rho V_u V_v$. The use of a variable v will therefore produce a smaller
variance than the use of a constant v if $\rho > V_v / (2 V_u)$. Since in our case the coefficient
of variation of t is far less than the coefficient of variation of x", it does not require a
very large value of the correlation $\rho$ to give the use of t a small advantage.
Consequently, we have the fortunate result that the more convenient estimator has
a somewhat smaller variance and is not only appropriate but recommended.
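As an illustration of the relvariance comparison above, the following Python sketch evaluates V_u^2 + V_v^2 - 2 rho V_u V_v with a constant and with a variable denominator. The coefficients of variation and the correlation are hypothetical values chosen only for illustration, not figures from the report, and the function names are introduced here. Setting the extra term V_v^2 - 2 rho V_u V_v to zero gives the break-even correlation V_v / (2 V_u) that the second function returns.

```python
def relvariance_of_ratio(cv_u, cv_v, rho):
    """First-order relvariance of u/v: V_u^2 + V_v^2 - 2*rho*V_u*V_v."""
    return cv_u ** 2 + cv_v ** 2 - 2.0 * rho * cv_u * cv_v

def breakeven_correlation(cv_u, cv_v):
    """Correlation at which a variable denominator neither helps nor hurts."""
    return cv_v / (2.0 * cv_u)

# Hypothetical inputs (assumptions for illustration only):
cv_x = 0.30  # coefficient of variation of the average overpayment per case, x"
cv_t = 0.03  # coefficient of variation of the sample average payment, t
rho = 0.20   # correlation between x" and t

with_T = relvariance_of_ratio(cv_x, 0.0, rho)   # constant denominator T
with_t = relvariance_of_ratio(cv_x, cv_t, rho)  # sample-based denominator t
threshold = breakeven_correlation(cv_x, cv_t)

print(with_T, with_t, threshold)
```

With these inputs the correlation exceeds the break-even value, so the sample-based denominator gives the smaller relvariance. A correlation below the break-even value would reverse the comparison.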






Table I-1. Comparison of variances of x"/t and x"/T for Population A (Variances are shown times 10^)

Sample size                       Variance of   Variance of   Ratio
(n/n')      Estimator             x"/T (1)      x"/t (2)      (1)/(2)

1200/180    Regression            1.397         1.297         1.08
            Difference, k=1       1.383         1.307         1.06
            Difference, k=.9      1.393         1.309         1.06
            Difference, k=.8      1.445         1.351         1.07

500/80      Regression            3.136         2.897         1.08
            Difference, k=1       3.004         2.938         1.02
            Difference, k=.9      3.004         2.940         1.02
            Difference, k=.8      3.117         3.030         1.03

300/50      Regression            5.176         4.696         1.10
            Difference, k=1       4.923         4.786         1.03
            Difference, k=.9      4.981         4.791         1.04
            Difference, k=.8      5.209         4.937         1.06
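The link between the variance ratios in Table I-1 and the corresponding difference in standard errors can be checked with a few lines of Python. The variances below are copied from the regression rows of the table. The dictionary and function names are illustrative, and the assignment of the larger variance to the estimator using T follows the discussion in the text.

```python
import math

# Regression-estimator variances from Table I-1 (as tabulated, scaled):
# sample sizes n/n' -> (variance using T, variance using t)
regression_rows = {
    "1200/180": (1.397, 1.297),
    "500/80": (3.136, 2.897),
    "300/50": (5.176, 4.696),
}

def variance_and_se_ratio(var_T, var_t):
    """Return (variance ratio, standard-error ratio) of T relative to t."""
    ratio = var_T / var_t
    # Standard errors are square roots of variances, so their ratio is
    # the square root of the variance ratio.
    return ratio, math.sqrt(ratio)

results = {
    size: variance_and_se_ratio(var_T, var_t)
    for size, (var_T, var_t) in regression_rows.items()
}

for size, (ratio, se_ratio) in results.items():
    print(f"{size}: variance ratio {ratio:.2f}, SE ratio {se_ratio:.3f}")
```

Because the standard-error ratio is the square root of the variance ratio, a variance difference of 8 to 10 percent shrinks to a standard-error difference of roughly 4 to 5 percent.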






